














Journal of Clinical Psychology
Volume XII, Number 3
JULY 1956














EDITORIAL BOARD 


FREDERICK C. THORNE, Editor 
Editorial Office: 5 Pearl Street 
Brandon, Vermont 


JERRY W. CARTER, JR., U. S. Public Health Service
HOWARD F. HUNT, University of Chicago
WILLIAM A. HUNT, Northwestern University
ELAINE F. KINDER, New York State Department of Mental Hygiene
C. M. LOUTTIT, Wayne University
ROBERT I. WATSON, Northwestern University





ELEANOR C. THORNE, Business Manager 

























COMPANION VOLUMES 


PRINCIPLES OF PSYCHOLOGICAL EXAMINING
by FREDERICK C. THORNE, M.D., Ph.D.
Editor, Journal of Clinical Psychology
PRICE: $7.50

PRINCIPLES OF PERSONALITY COUNSELING
by FREDERICK C. THORNE, M.D., Ph.D.
Editor, Journal of Clinical Psychology
PRICE: $6.00





SYSTEMATIC: The first comprehensive system of diagnosis and psychotherapy based
solidly on basic psychological science.

THEORETICAL: Presenting a comprehensive theory of the nature of personality
integration with special emphasis on the differential diagnosis and modification
of the primary etiologic factors which actually organize socially significant
behaviors.

ECLECTIC: All pertinent theories and facts in basic science psychology and clinical
psychological science are related in a systematic applied integrative psychology.

COMPLETE: For all teaching and reference purposes.

ADVANCED: Both of these books present a new orientation to diagnosis, counseling
and psychotherapy which is more than just a rehash of the literature. Principles
of Psychological Examining is more than a textbook of psychometrics since
tests and measures are presented only as they relate to more fundamental problems
of psychological diagnosis. Similarly, Principles of Personality Counseling
presents thorough discussions of the indications and contraindications for applying
all known methods of psychotherapy after a primary diagnosis of factors
requiring modification has been accomplished.





ORDER NOW 





JOURNAL OF CLINICAL PSYCHOLOGY 
5 Pearl Street Brandon, Vermont 








JOURNAL OF 
CLINICAL PSYCHOLOGY 


VOL. XII JULY 1956 No. 3








CONTENTS 


Validation and Intensification of the Sixteen Personality Factor Questionnaire. RAYMOND B. CATTELL
Trait Judgments from Photographs as a Projective Device. DONALD T. CAMPBELL AND LEROY S. BURWEN
Research in Clinical Psychology: 1955. W. GRANT DAHLSTROM
Speech Behavior and Egocentricity. MARION WHITE McPHERSON
Vocational Tests as Measures of Performance of Schizophrenics in Two Rehabilitation Activities. BERNARD A. STOTSKY
A Study of the Relationships between Self-Ratings and Parent-Ratings for a Group of College Students. LOUIS G. PORTER AND CHALMERS L. STACEY
An Experimental Study of the Effects of Individual and Group Presentation of the Rorschach Plates. J. H. ROHRER AND BARBARA W. EDMONSON
Spiral Aftereffect as a Test of Organic Brain Damage. ARTHUR J. GALLESE
The Rorschach as a Physiological Stress. HUDSON JOST AND LEON J. EPSTEIN
The Cornell Medical Index in a Psychiatric Outpatient Clinic. FRANKLYN N. ARNHOFF, LA VERN C. STROUGH AND RICHARD B. SEYMOUR
A Quantitative Method of Scaling Communication and Interaction Process. HAROLD J. FINE AND CARL N. ZIMET
Evaluative Study of One Hundred Transorbital Leucotomies. MATHEW D. MERMELSTEIN
Factors Influencing Utilization of Psychotherapeutic Services in Male College Students. NORMAN S. GREENFIELD AND WILLIAM F. FRY


MMPI Score Changes Induced by Lysergic Acid Diethylamide (LSD-25). RICHARD E. BELLEVILLE
Some Factors Associated with the Visual Threshold for Taboo Words. ARON W. SIEGMAN
Anxiety and Goal-Setting Behavior. PHILIP HIMELSTEIN
The Repeat Reliability of Clinical Judgments of Test Responses. WILLIAM A. HUNT AND FRANKLYN N. ARNHOFF


Distraction and Affective Disturbance. G. A. FOULDS
Body-Concept Disturbances of Patients with Hemiplegia. FRANKLIN C. SHONTZ
Personality Correlates of Duodenal Ulcer and other Psychosomatic Syndromes. PETER M. LEWINSOHN
Regression or Disintegration in Schizophrenia. EDWARD M. SCOTT
Editorial Opinion
Book Notices


GENERAL INFORMATION 


The Journal of Clinical Psychology is an independent journal dedicated to the advancement of the 
clinical method in psychology. Although primarily a scientifically oriented professional journal limited 
to publication of original research reports and authoritative theoretical articles, it aims to foster the 
promotion and expansion of clinical psychology as an applied science. 


Communications and articles submitted for publication are welcome and should be addressed 
to the Editorial Office, 5 Pearl Street, Brandon, Vermont. 


ORIGINAL ARTICLES are published only with the understanding that they are contributed 
exclusively to the Journal of Clinical Psychology. The publishers are not responsible for statements 
made or opinions expressed by contributors in articles printed in these columns. Copyrights cover 
publication in the Journal of Clinical Psychology and articles may not be reproduced without 
permission of the publishers. 


THE STYLE of all manuscripts including bibliographies must be in accordance with the
instructions given in: Anderson, J. E., and Valentine, W. L. The preparation of articles for publication
in the journals of the American Psychological Association. Psychol. Bull., 1944, 41, 345-376.
References should be arranged alphabetically according to the directions given above. Full address of the
author should appear somewhere in the article, preferably at the end.


REPRINTS are furnished on order only and must be requested on the order blank provided 
when galley proofs are returned. Change of address notices should be sent one month before moving
and should include old and new addresses. Undelivered copies will be remailed only at subscribers’ 
expense. 


The Journal of Clinical Psychology is published quarterly, appearing in January, April, July and 
October, at Rutland, Vermont, by Frederick C. Thorne, M.D. Subscription price $7.50; Canadian 
subscriptions $8.00; foreign subscriptions $8.50 including postage; single copies $2.00. Entered as 
second class matter January 15, 1945, at the post office at Burlington, Vermont, under the Act of 
March 3, 1879. Reentered at the post office at Brandon, Vermont. Additional entry at the post office
at Rutland, Vermont. Title registered, U. S. Patent Office. Editorial and Business Offices, 5 Pearl
Street, Brandon, Vermont.


Copyright, 1956, Frederick C. Thorne, M. D. 





VALIDATION AND INTENSIFICATION OF THE SIXTEEN PERSONALITY 
FACTOR QUESTIONNAIRE 


RAYMOND B. CATTELL 


Laboratory of Personality Assessment and Group Behavior 
University of Illinois 


THE AIM OF IMPROVING AN INSTRUMENT 


Although the ideal in personality measurement, as in ability measurement, is 
to deal with functionally unitary traits, there are as yet extremely few personality 
factor scales available. The clinical, educational or industrial psychologist who is 
ready for the sophisticated and effective diagnosis and prediction which the use of 
factors—in the specification equation and in pattern functions of factor profiles— 
makes possible, finds available only one instrument of objective factor measurement
and three or four questionnaires. Compared with the former, the
latter have the virtue of brief and simple administration and the defect of distortability,
which together permit a widespread usage, but with cooperative subjects
only. Accordingly, though objective personality factor tests are on the march,
cooperative subjects are common enough to justify considering the pencil and paper 
questionnaire as a permanent part of the psychologist’s equipment, and seeking to 
perfect it. This paper is an account of the concepts, methods and results in produc- 
ing a revision of the 16 P. F. Questionnaire. 

The Sixteen Personality Factor Questionnaire, which consists of fifteen temper- 
amental or dynamic factors and one general intelligence factor, has been in use seven 
years. During that time it has been translated for use in eight countries. It has
accumulated valuable social validation data in the form of profiles for about thirty
occupations and six clinical and delinquency syndromes. Certain important
regression weights of factors on criteria have also been determined, notably for predicting
certain occupational successes, accident proneness, success in various
kinds of leadership, the selection of researchers and creative persons,
and the prediction of that part of educational achievement not due to ability.

Although the 1948 factorization on which the construction of the 16 P. F. was 
based, and which we shall henceforth call “the original factor foundation”, availed
itself of the most advanced factor techniques then possible, and was based on an 
exceptionally wide area of items, it was the stated intention of the designers at the 
time to re-check the factor structure, later, by cross validation on different popula- 
tions, and by entirely independent rotations. The term validation in the present 
title applies to determining personality factor validity, i.e., internal concept or con- 
struct validity, while the term “intensification” is borrowed from photography, as a 
useful designation in psychometrics for the special, additional process of raising the 
saturation of items on required factors and reducing irrelevant correlations, i.e.,
correlations with factors other than the intended one. Thus, validation concerns the 
confirmation of a factor; while intensification connotes the development of items to 
express it more strongly and distinctly, as happens in intensifying a photographic 
negative. The notion of “homogenizing” a scale is not the same as “intensifying” it,
for a test may be made more homogeneous without being made more factor-pure,
and we have argued elsewhere that there are systematic psychological reasons
why this may actually tend to lower factor saturations. Our discussion of test con- 
struction theory and its illustration by a particular case, though centered on valida- 
tion and intensification, makes a complete review of necessary principles in factored 
test construction. 


THE ORIGINAL 16 P. F. FOUNDATION IN THE FULL “PERSONALITY SPHERE”


Emphasis on certain particular standards and requirements in what follows can 
be understood only if the reader is first given some perspective on the emergence of 
the 16 P. F. in relation to basic personality research. For this test is only one part
of a whole series of test developments which are unique in that the constructors are
primarily concerned with basic personality structure and only secondarily with
test “gadgets” per se. In the first place, the adult 16 P. F. is developed in conjunction
with other questionnaire constructions, cross-sectioning personality at different
age levels, notably at the level of early adolescence, at seven years, and at four
years. It thus implies dependence on research findings and emerging concepts
about basic personality development. Secondly, at the level of adult cross-section,
the 16 P. F. is an integrated part of a research advance conceived broadly in terms
of three possible observation media: life records in situ, questionnaires and objective
tests. This attention to breadth of manifestation increases our understanding
of the primary personality structures in terms of different media and situational 
expressions. 

The 16 P. F. can thus claim a somewhat more intensive and extensive research 
basis than the few excellent factored questionnaires otherwise available, notably in
(1) the coverage of the personality sphere, which in this case is extended by cross-media
factorings, with the wider reference of meaning thus ensured, and (2) the
determination of a factor loading for every item, instead of for conglomerate
blocks of items, by virtue of the special techniques of factoring invented for handling
large numbers of variables. The resulting better selection of items permits measures 
of higher factor saturation though still with small numbers of items per factor. 

The first of these advances can be briefly substantiated and given essential 
descriptive detail as follows. The original research on verbal responses from which 
the 16 P. F. emerged was based on a population of questionnaire items derived from: 


(1) A complete survey of all well known questionnaire, opinionnaire,
interest and value scales. The evidence¹ thereof indicated that about twenty
factors could be discerned as of 1946. Each of these factors was represented in
the ensuing research by sufficient markers, and by newly invented items directly
designed to measure the concepts better than by any existing tests.

(2) Evidence of entirely new personality factors, from non-questionnaire
sources. In particular, new items were added to the pool of variables to cover
the fourteen factors found in factoring rated behavior based on the complete
personality sphere, as well as on objective tests. Parenthetically, the inter-factor
studies and Saunders’ projection of questionnaire factors into behavior
rating space have shown that the questionnaire factors can be matched
with behavior rating factors much more closely and completely than with the
objective test factors, at the present stage of the latter. Only life record factors
D, J, and K are missing from the questionnaire factors and only questionnaire
factors Q1, Q2, Q3 and Q4 are missing from the life record factors. (Hence the
unique Q designation for these four factors.)


The outcome of the original factoring was a good confirmation both as to num- 
ber and kind of factors, agreeing with the hypothesized twenty from the above broad 
survey of evidence in questionnaire and rating media. Nevertheless it seemed to us 
appropriate and necessary research strategy at that time to drop the three or four 
most poorly defined factors in the original factoring and to build the 16 P. F. from 
intelligence and the clearest fifteen factors. Research should at some time pick up 
the four discards, but it has seemed sufficiently ambitious a task for our laboratory 
to concentrate on the definition of the fifteen, and their internal and social validities. 
Even these sometimes exceed the span of attention of certain applied psychologists!

Now the original structure was based on an 82 x 82 matrix, so the additions
necessary for the 187 item A and B forms (giving either 10 or 13 items per factor per
scale) were made from items known to correlate with the factors in the survey or
picked up by item analyses against the separate factors. The mean coefficient of
equivalence for all factors between the two forms was .51, and the mean consistency
coefficient .68, which may be considered good for 10 to 13 items, but since the simple
structure showed factors oblique up to about 0.3, better consistencies could be desired.
Accordingly a series of re-factorings, with search for more highly loaded items,
was planned, as described here, to give the highest possible factor validities for a test
of this length.

¹These factors resulted largely from the work of Guilford, Ferguson, Humphrey and Strong,
Flanagan, Layman, Mosier, Reyburn and Taylor, Thorndike, Thurstone, and
Vernon and covered, among many others, the data of such tests as the Bernreuter, Bell, Strong,
Allport-Vernon and other tests.
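As a rough illustration of how scale length bears on the coefficients just reported, the sketch below applies the standard Spearman-Brown formula to the mean equivalence coefficient of .51. The text does not say how the consistency coefficient of .68 was computed, so reading it alongside a doubled-length estimate is an assumption made here for illustration only; the function name `spearman_brown` is hypothetical.

```python
def spearman_brown(r, length_factor=2.0):
    """Predicted reliability when a scale is lengthened by `length_factor`.

    Standard Spearman-Brown prophecy formula; it assumes the added items
    behave like the existing ones (parallel parts).
    """
    return length_factor * r / (1.0 + (length_factor - 1.0) * r)


# Reported mean coefficient of equivalence between the A and B forms.
mean_equivalence = 0.51

# Predicted coefficient if both forms are scored together as one scale.
combined = spearman_brown(mean_equivalence)
print(round(combined, 2))
```

Under these assumptions the stepped-up value comes to roughly .68, which shows how much a scale of only 10 to 13 items stands to gain when the two forms are used together.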


CANONS OF FACTORED TEST CONSTRUCTION

An ideal factor scale differs from a Walker-Guttman scale only in that it
yields a measure of a factor instead of a composite of factors. It should meet the 
conditions that (1) all items have the same factor composition, namely that of a pure, 
psychologically meaningful, simple structure factor, and (2) items are graded in 
degree of difficulty according to equal intervals on a normal distribution, since factors 
can be defined as normally distributed. (In the more general context of personality, 
which includes ability, “difficulty”, which applies only to abilities, is better expressed
as “eccentricity of cut”). If the first condition is guaranteed, item comparisons
in all possible pairs as in Guttman scaling are unnecessary to ensure the second,
since grading can be determined from cutting positions on the distribution. On the 
other hand, if it is not, application of the Walker-Guttman condition may, according 
to present experience, prevent anything broader than a scale for a factor of a relative- 
ly specific nature being formed. 

The aim of a multiple factor questionnaire is to form distinct factor scales, in 
this case sixteen, with mutual obliquities no greater and no less than those discovered 
to exist among the simple structure factors themselves. Since perhaps only one item 
in a thousand initially tried ultimately turns out to be a pure factor measure it is 
likely to be several years before a sufficiency of items is discovered to make sixteen 
ideal factor scales. Accordingly at present the aim must be to obtain scales operat- 
ing with suppressors, i.e., obtaining the requisite degree of freedom from factor inter- 
correlation by using the principle of summated factor suppression. This states that 
the collection of items used for one factor scale should have the highest mean loading on 
the required factor consistent with loadings on all other factors summing to zero. For 
example, in a two item (a and b) scale for Factor $F_1$, we should require that if
$a = xF_1 + sF_2$ then $b = yF_1 - sF_2$.
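The two-item case above can be checked numerically. The following is a minimal sketch, assuming a scale scored as a simple unweighted sum of its items, so that the (unstandardized) loading of the scale on each factor is the sum of the item loadings. The numerical values chosen for x, y and s are hypothetical.

```python
def scale_loadings(items):
    """Sum item loadings factor by factor for a scale scored as a plain item sum.

    `items` is a list of tuples, one tuple of factor loadings per item.
    """
    n_factors = len(items[0])
    return [sum(item[k] for item in items) for k in range(n_factors)]


# Hypothetical loadings on (F1, F2): x = 0.5, y = 0.4, s = 0.3.
item_a = (0.5, 0.3)    # a = x*F1 + s*F2
item_b = (0.4, -0.3)   # b = y*F1 - s*F2

total = scale_loadings([item_a, item_b])
print(total)  # loading on F1 accumulates, loading on F2 cancels
```

The wanted factor accumulates across items while the unwanted one sums to zero, which is all that the principle of summated factor suppression requires of a collection of individually impure items.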

In any refined statistical work it is important not to lose sight of basic matters 
of psychology and common sense. For example a highly loaded and statistically 
perfect item in the given student sample is no good if it contains words obscure to 
non-students; contains a reference to an event likely to be unknown a year later; 
has an eccentricity of cut of 95% to 5%, and takes two minutes to read! 

Accordingly the construction of the revised 16 P. F., from start to finish, followed
the canons of procedure set out below.


(1) A very large number of items (in this case 1552) is made up by at
least six people (to avoid person-specific factors sometimes demonstrable in 
tests), in the light of all that is known (in questionnaire, rating and 16 P. F. 
criterion prediction data) about the number and nature of the primary person- 
ality factors. 

(2) These are to be submitted to persons of different background, and to 
word count surveys, to eliminate uncommon words (Flesch word count), items 
that are too long, ambiguous or tied to matters too specific in place or time. 

(3) Two population samples are to be taken, one toward the upper and 
one the lower half of the range for which the test is intended and correlation 
matrices are calculated among the items separately for these. 


(4) Items with extreme cuts (under 10% in one end category of three),
in either sample, are eliminated before the calculation of correlation matrices.
The phi coefficient or the tetrachoric is used. Phi divided by the maximum
possible phi for the given extremity of cut has been used by us before and, like
the tetrachoric, has the advantage of getting rid of ‘difficulty factors’, but
since it is prone to yield non-Gramian matrices, and since the alternative tetrachoric
involves undue assumptions, the present study used phi.

(5) The two matrices are separately factored and rotated blindly to 
simple structure. It is very important that the latter be truly and thoroughly 
done. 

(6) Items are picked for each factor having the highest loadings on the
required factor and, if possible, suppressing (i.e., cancelling) loadings on the
others. At this point only those items are carried further which show emphatic 
consistency in their factor patterns in the two studies. For example, no matter 
how significant the positive loading on one study may be, the item would be 
rejected if it has insignificant or negative loading on the other. 

(7) To get suitable means, variance and grading on each factor scale, the
cuts (alternative response frequencies) must be examined. It is possible to predict
both mean and variance of the resultant scale, by certain assumptions,
from the cuts on the included items. The choice of items by cuts should accordingly
give a mean that is central on the scale range and a maximum scatter
(near-even cuts, to an extent compatible with usefulness for extreme samples)
as well as equal means and variances for the equivalent A and B forms.

(8) An even balance of ‘Yes’ and ‘No’ answers must be chosen, from the 
surviving items, to score positively on each factor, in order to abolish position 
or response set effects.

(9) The items should be symmetrically divided between A and B forms, 
as to factor loading, mean, variance, yes and no answer, etc., as determined 
above. (Partly to ensure the kind of equivalence cited in (7) above). Then they 
need to be arranged in that form of cyclical order, avoiding several items in 
sequence for the same factor, most convenient for the scoring key. 

(10) The scales must be standardized with the usual attention to stratified 
sampling, etc. 
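The quantities named in canon (4) can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually programmed for the study: the items are taken as already dichotomized for the purpose of phi, the function names are hypothetical, and all counts are invented.

```python
from math import sqrt


def phi(a, b, c, d):
    """Phi coefficient for a 2x2 table.

    a = both items positive, b = first item only,
    c = second item only,   d = neither item positive.
    """
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom


def phi_max(p1, p2):
    """Largest positive phi attainable given the two items' positive proportions."""
    hi, lo = max(p1, p2), min(p1, p2)
    return sqrt(lo * (1 - hi) / (hi * (1 - lo)))


def has_extreme_cut(counts, threshold=0.10):
    """True if either end category of the three holds under `threshold` of responses.

    `counts` is (first end category, central category, other end category).
    """
    total = sum(counts)
    return min(counts[0], counts[-1]) / total < threshold


# Hypothetical 2x2 table on 100 subjects.
r = phi(40, 10, 20, 30)
ceiling = phi_max(0.5, 0.6)   # item 1 positive in 50%, item 2 in 60%
print(round(r, 3), round(ceiling, 3), round(r / ceiling, 3))

# Hypothetical response counts on 408 subjects.
print(has_extreme_cut((380, 10, 18)), has_extreme_cut((200, 50, 158)))
```

Dividing phi by its ceiling removes the dependence on unequal cuts, which is the sense in which that ratio gets rid of "difficulty factors"; the study nevertheless retained plain phi for the reasons given in canon (4).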


PROCEDURES FOLLOWED IN THE TWO FACTORINGS AND FINAL CONSTRUCTION 


The present revision of the 16 P. F. has in principle followed the above ten 
canons, but economic compromises with the ideal have had to be made in steps 4 and 
6 as will be described. Since at almost every step there is loss from a particular selec- 
tion process, one must start with a far larger number of items than the 374 which in 
this case are intended to constitute the final A and B forms of the 16 factor scale. 
There is no obstacle in starting with quite a large number to be submitted to the 
verdict of the first four canons, and in fact we began with 1552. But owing to the 
bottle neck created by the limits of size of factorizable matrices, every stage beyond 
step 5 tends to suffer more or less grievously from a dearth of items necessary to 
reach the desired standards. Indeed in past test construction, this has proved a well 
nigh insuperable obstacle to producing multi-factor scales with loadings known and 
confirmed for every item. Although the obstacle has not been completely overcome 
here, we feel that the device of “parcelled factor analysis,” with the use of extension
matrices, described here is perhaps the most important technological contribution 
of this article—apart from the finished instrument itself. 

By this device, and the use of the electronic computer, larger initial matrices 
have here been factored than any hitherto reported, and the bottle neck partly 
eliminated. Accordingly, taking steps 1 through 4 as understood in current psycholo- 
gical practice we shall concentrate on the two factor analyses which have succeeded 
the original factor analysis and which we shall henceforth refer to as the second
and third checking analyses. The first of these was done in connection with producing
a “Basic English” Form C of the 16 P. F. and has been described elsewhere.
On 295 men and women undergraduates, it began with 720 items which were reduced
by steps 1 and 2 to a set of 450, which we shall call Extension Questionnaire A and 
which was reduced to exactly 300 by step 4. At this point a further selection down to 
126 was made (a 300 x 300 matrix being still unmanageable factorially) by taking 
only those items showing significant (P = .01) correlations with factors in the 
existing 16 P. F. or with each other. 
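To give a sense of how strict the P = .01 screen is with 295 subjects, the sketch below estimates the smallest correlation that would pass it. The text does not state which test was used, so a two-tailed test with a normal approximation to the t distribution is assumed here; `critical_r` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist


def critical_r(n, alpha=0.01):
    """Approximate smallest |r| significant at `alpha` (two-tailed) for n subjects.

    Uses t = r * sqrt(n - 2) / sqrt(1 - r**2), with the normal deviate standing
    in for the critical t, which is reasonable for samples of this size.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z / sqrt(n - 2 + z * z)


# Item counts reported at each stage of the second checking analysis.
stages = [720, 450, 300, 126]
retained = [later / earlier for earlier, later in zip(stages, stages[1:])]

print(round(critical_r(295), 3))
print([round(share, 2) for share in retained])
```

On these assumptions a correlation of roughly .15 is enough to pass, so the screen removes only items with essentially no relation to the existing factors or to one another.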

‘Parcelled factoring’ was now carried out for these 126 survivors and the 374 
items of the existing 16 P. F. combined as follows. Each 16 P. F. factor was entered 
as two variables (the minimum for “marking” and recognizing the resultant factors
expected). One variable was the score on the 13 items of the given factor on the A 
form and the other the 13 on the B Form. The 126 new items were grouped in 
‘parcels’ of three (and sometimes two) of a homogeneity guaranteed by original 
intercorrelations on a 126 x 126 matrix, and a relative factor purity indicated by 
correlations with the separate 16 P. F. factors. This gave 75 ‘parcel’ variables (30 
from the 15 personality factors of the 16 P. F., and 45 parcels of the new 126 items), 
a number readily factored. The saving in thus factoring a 75 x 75 matrix, despite the 
two preliminary special correlation jobs, over the 500 x 500 matrix otherwise necessary,
is considerable and has been evaluated elsewhere.
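The bookkeeping behind the parcelled design can be verified with a few lines of arithmetic. The sketch below counts only distinct correlation coefficients, which understates the full saving since factoring effort grows faster than the number of correlations. The helper names are hypothetical, and taking the item-by-factor job as 126 items against 15 factors is an assumption.

```python
def unique_correlations(n_variables):
    """Number of distinct off-diagonal correlations in an n x n matrix."""
    return n_variables * (n_variables - 1) // 2


def split_parcels(n_items, n_parcels):
    """Solve 3*threes + 2*twos = n_items with threes + twos = n_parcels."""
    threes = n_items - 2 * n_parcels
    twos = n_parcels - threes
    if threes < 0 or twos < 0:
        raise ValueError("items cannot be split into parcels of two and three")
    return threes, twos


# 45 parcels built from the 126 new items, in threes and sometimes twos.
threes, twos = split_parcels(126, 45)

# Work in the parcelled design: the 75 x 75 matrix plus the two preliminary jobs.
parcelled = (
    unique_correlations(75)      # matrix actually factored
    + unique_correlations(126)   # item intercorrelations used to form parcels
    + 126 * 15                   # items against the 15 factors (assumed size)
)
direct = unique_correlations(500)  # all 500 items factored directly

print(threes, twos)
print(parcelled, direct)
```

Thirty-six parcels of three and nine of two account exactly for the 126 items, and the parcelled route needs about a tenth as many correlations as the direct 500 x 500 matrix.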

The blindly rotated factors agreed well with the original 16 P. F. factoring as
to number and nature, except for some confusion of the factors of neuroticism and 
anxiety, commonly labelled O and C. In view of these relatively modest loadings on 
the two less clear factors, every factor was accordingly estimated by the most exact 
method from Thomson and the correlation of each item in the questionnaire determined
with the factors (a 15 x 500 matrix). The results were used both to evaluate
the existing 16 P. F. items and to construct the C Form. It is the first of these which
is relevant to the present study. By eliminating deadwood from the original 
16 P. F. it enabled us to start out with clearer, unconfused markers of the 15 factors 
for cross validation in the third factorization, and to guide the pulling in of new items 
from the second or B Extension Questionnaire, of 252 items described below, while it 
also supplied factor loadings for every item so that by the final factoring every load- 
ing would have a double check. 

In the second experiment, with 408 subjects, (227 Air Force men and 181 
undergraduates from four Illinois colleges) the markers for the known 16 P. F. factors 
were made up, not as previously, by taking all the factor items from one form to 
make one parcel, but by putting together only those 9-11 items for each factor shown 
by the second factoring above to be most highly loaded. Five “parcels” were made 
up in this case from the above proven items in the existing 16 P. F. to represent each 
factor. For the aim in this third factoring was to test the 16 P. F. structure more 
exactly than in the second factoring and to determine the correlations among simple 
structure factors with a high degree of exactitude. Also it aimed to get such well 
saturated factor estimates for each factor that they could be used to determine the 
loadings of new items, from the second extension questionnaire, with a precision 
comparable to a direct factoring, to permit replacement of any of the 169 existing 
items by any discovered in the extension having higher loadings. 

The new 75 x 75 matrix was centroid-factored and rotated to simple structure 
with great care. Every hyperplane is above the .01 level of significance by Bargmann’s
test. It was gratifying to find that most of the cross loadings among the
O and C factor (and to some extent among three of the Q factor) parcels encountered in
the second factoring disappeared in the better parcels of the third factoring. Fifteen 
factors were significant and were clearly identifiable by their markers.² The
$C = \lambda' \lambda$ matrix, giving cosines among the reference vectors when simple
structure was reached, is set out in Table 1. It will be seen that the obliquities are
moderate. A second order factoring of these inter-factor correlations is in press.

²The unrotated, rotated and transformation matrices for this analysis are deposited with the
American Documentation Institute, Library of Congress,

Extension questionnaire B, of 512 items, was reduced by steps 1 to 4 to 252
items, which were then correlated, on 408 subjects, as above, with each of the 15 
factors, estimated from the parcel variables and their loadings, by the method in- 
dicated before. 

Whenever existing items in the original 16 P. F. correlated, on both factorings, 
.20 or less with the factor they represented, they were cut out as ‘deadwood’ and re- 
placed by items found to be more highly loaded from Extension Questionnaire B. 
Unfortunately at this point more poor items were found in factors M, C, O and Q3
than there were items to replace them. So a third Extension Questionnaire, C, was
made, beginning with 320 and reducing to 200 items deliberately aimed at these
factors. Thus, by direct correlation with the factors, on a sample of 200 men and
women undergraduates, sufficient items loaded above 0.2 were found to supplant the
unsatisfactory items in factors M, C, O and Q3.
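The replacement rule just described can be sketched as below. This is an illustration of the stated rule only (an item is "deadwood" when it correlates .20 or less with its factor on both factorings); the function names, item labels and loadings are all hypothetical.

```python
def is_deadwood(r_second, r_third, cutoff=0.20):
    """True when an item correlates at or below the cutoff on both factorings."""
    return r_second <= cutoff and r_third <= cutoff


def revise_scale(existing, candidates, cutoff=0.20):
    """Drop deadwood and fill the gaps with the highest-loaded candidates.

    existing:   {item: (r on second factoring, r on third factoring)}
    candidates: {item: r with the factor estimate}
    Returns (kept items, replacement items, number of gaps left unfilled).
    """
    kept = [name for name, (r2, r3) in existing.items()
            if not is_deadwood(r2, r3, cutoff)]
    n_needed = len(existing) - len(kept)
    usable = sorted((name for name, r in candidates.items() if r > cutoff),
                    key=lambda name: candidates[name], reverse=True)
    replacements = usable[:n_needed]
    return kept, replacements, n_needed - len(replacements)


# Hypothetical scale of four items and an extension pool of two.
existing = {"old_1": (0.48, 0.46), "old_2": (0.15, 0.20),
            "old_3": (0.36, 0.22), "old_4": (0.10, 0.05)}
candidates = {"new_1": 0.31, "new_2": 0.12}

kept, replacements, shortfall = revise_scale(existing, candidates)
print(kept, replacements, shortfall)
```

A positive shortfall corresponds to the situation met with factors M, C and O, where the extension pool ran out before all the poor items could be replaced and a further extension questionnaire had to be written.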

In the following section the resultant structure of the 16 P. F. is illustrated by 
two items from each factor, one from the A form and one from the B form. These 
are neither the highest loaded items in the factorings nor the highest loaded among 
those which survived the ensuing selections of steps 5 through 10. They are selected 
instead to illustrate the degree of constancy of loading of particular items on particu- 
lar factors on two independent factorizations; the range of mean loadings; and the 
psychological nature of the items expressing each factor. 


CONFIRMATION AND DEGREE OF INVARIANCE OF THE SIXTEEN FACTORS 


Each item below is set out under the factor as usually symbolized by letter and
contingent names. To the right are set out (1) the response, left or right (for (a) or
(b), or yes or no), which scores positively on the factor; (2) the loadings in the two
factor studies (second and third); and (3) the frequency of the positive scoring,
central and negative scoring responses in 408 subjects.


Factor A. CYCLOTHYMIA-VS-SCHIZOTHYMIA

1. Form A. If the earnings are the same I would rather be:
   (A) a lawyer
   (B) a freight air pilot

In a factory I would rather be:
   (A) in charge of mechanical matters
   (B) engaged in interviewing and hiring people.

Factor B. GENERAL INTELLIGENCE
(These items were factored apart from the main study).


Factor C. EGO STRENGTH

3. Form A. I occasionally have realistic dreams that disturb my sleep.
   Positive response: Rt. Loadings: .48, .46. Response frequencies: 356, 12, 40.

4. Form B. I sometimes feel compelled to count things for no particular purpose.
   Positive response: Rt. Loadings: .36, .22. Response frequencies: 255, 16.


Factor E. DOMINANCE

I occasionally tell strangers about the things I am interested in and good
at, without direct questions from them.
   Positive response: Lt. Loadings: .20.

I have on occasion torn down a public notice forbidding me what I felt I
had a perfect right to do.
   Positive response: Lt. Loadings: .24.


Factor F. SURGENCY-VS-DESURGENCY

I like a job that offers change, variety and travel, even if it involves
some dangers.
   Positive response: Lt. Loadings: .30.

I would prefer the life of:
   (A) a master printer in a modern plant
   (B) an advertising man and promoter.
   Positive response: Rt. Loadings: .38, .47.


Factor G. SUPER EGO STRENGTH

I think that good manners and respect for law are more important than
excessive freedom.
   Positive response: Lt. Loadings: .36, .47.

I admire more a person who:
   (A) is brilliantly intelligent and creative
   (B) has a strong sense of duty to the things he believes in.
   Positive response: Rt. Loadings: .28, .31.


Factor H. Immunity (or ADVENTUROUS CYCLOTHYMIA) 
I have at least as many friends of the 
opposite sex as of my own sex. Lt. ; 385 
If people in the street, or standing in 
a store, watch me I feel slightly 
embarrassed. Rt. .39 49 


Factor I. SENSITIVITY-VS-TOUGHNESS
I would rather spend a free evening: 
(A) with a good book 
(B) working on a project with 
friends. 
In art and music we should 
(A) give popular demand what it 
wants, regardless of quality 
(B) try to raise nme, by giving 
experts a chance to control taste. Rt. .33 


Factor L. PARANOID TREND
If I am quite sure that a person is 
unjust or behaving selfishly I show 
him up, even if it takes some trouble. Lt. .37 
I suspect the honesty of people who 
are more friendly than I would nat- 
urally expect them to be. Lt. .29 


Factor O. FREE ANXIETY
I feel grouchy and just do not want 
to see people: 
(A) Occasionally 
(B) Rather often. 
I am moved almost to tears by some- 
thing upsetting: 
(A) never 
(B) sometimes Rt. 19 21 


Factor Q1. RADICALISM-VS-CONSERVATISM
It would be better if we had more 
strict observance of Sunday, as a day 
to go to church. Rt. .32 
In my work more troubles arise from 
men who:— 
(A) are constantly changing meth- 
ods that are already O. K. 
(B) refuse to employ up-to-date 
methods. Rt. -22 


Factor Q2. SELF SUFFICIENCY
I like to take an active part in social 
affairs, committee work, etc. Rt. .23 
I get as many ideas from reading a 
book myself as from discussing its 
topics with others. Lt. 


Factor Q3. WILL CONTROL
When talking I like:— 
(A) to say things just as they occur 
to me. 
(B) to wait and say them in the most 
exact style possible. Rt. 
However difficult and unpleasant the 
obstacles I always persevere and 
stick to my original intentions. Lt. 58 


Factor Q4. TENSION (SOMATIC ANXIETY)
At times of stress or overwork I suffer 
from indigestion or constipation:— 
(A) practically never 
(B) occasionally Rt. 
My nerves are sometimes on edge, so 
that certain sounds, e.g., a screechy 
hinge, are unbearable and “give me
the shivers”. Lt.


SUMMARY 


(1) A multiple factor-scale questionnaire, covering fifteen personality factors 
and the cognate factor of general intelligence, based on an older factor analysis, has 
had its factor structure re-examined, the factor loading of every item determined, 
and items of low validity replaced by new items of improved validity. The process is 
defined as validation and intensification, since the conceptual factor validity of each 
item is determined, and the factor saturation and independence of the sixteen scales 
is intensified. 

(2) Ten canons for multiple factor scale construction are laid down and ex- 
emplified in operations with the Sixteen P. F. Questionnaire. 


(3) The principal innovation is the introduction of “parcelled factor analysis”
in which a much larger number of items than could usually be handled is first grouped,
by clustering and correlation with existing factors, into a smaller number of
homogeneous (but factor impure) “parcels” or short, rough scales. The factor structure
is determined on this relatively small matrix (75 x 75 in this example) and the
parcels are then “undone” and all constituent items correlated directly with the
factors estimated in terms of parcels. An “extension questionnaire” of items hypothesized
to be highly correlated with the factors is also correlated item by item with
this same factor score, whereby weak items in the original questionnaire are replaced.
This device gives the factor loading of every item in the original test and in the ex- 
tension with every substantial factor, and under comparable conditions, at a small 
fraction of the prohibitive labor for a matrix of order 1000 x 1000. The item loadings 
could be, but were not, corrected for attenuation by unreliability of factor estimate, 
since only relative goodness of items need be accurate in this procedure. 
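
The parcelling procedure in (3) can be sketched in a few lines. This is only an illustrative outline under stated assumptions: the data are synthetic, the function name `parcelled_item_loadings` and the parcel assignment are hypothetical, and principal components stand in for the factor extraction and blind rotation actually used in the study. It assumes NumPy is available.

```python
import numpy as np

def parcelled_item_loadings(items, parcel_of, n_factors):
    """items: (n_subjects, n_items) scores; parcel_of: parcel index for each item."""
    n_parcels = int(parcel_of.max()) + 1
    # Step 1: sum the items of each parcel into a short, rough scale.
    parcels = np.stack(
        [items[:, parcel_of == p].sum(axis=1) for p in range(n_parcels)], axis=1
    )
    # Step 2: factor the small parcel correlation matrix.
    # Principal components are a simple stand-in, not the study's own method.
    corr = np.corrcoef(parcels, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    top = np.argsort(eigvals)[::-1][:n_factors]
    # Step 3: estimate factor scores from the standardized parcels.
    z = (parcels - parcels.mean(axis=0)) / parcels.std(axis=0)
    factor_scores = z @ eigvecs[:, top]
    # Step 4: "undo" the parcels by correlating every item with every factor estimate.
    n_items = items.shape[1]
    full = np.corrcoef(np.hstack([items, factor_scores]), rowvar=False)
    return full[:n_items, n_items:]

# Synthetic illustration: 12 items, the first six driven by one latent factor,
# the last six by another, grouped into four parcels of three items each.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 2))
true_factor = np.repeat([0, 1], 6)
items = latent[:, true_factor] + rng.normal(size=(300, 12))
parcel_of = np.repeat(np.arange(4), 3)
loadings = parcelled_item_loadings(items, parcel_of, n_factors=2)
print(loadings.round(2))
```

The saving comes from step 2: only the parcel matrix is factored, while each item costs just one correlation per factor estimate in step 4.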


(4) Two parcelled factorizations, on 400 and 169 items, were carried out on 
independent population samples and with independent computing and blind rotation. 
However the process was iterative in that clearer factor definition was achieved in 
the second through entering with more factor-pure and homogeneous parcels as a 
result of the findings of the first. Both yielded, by existing tests of completion of 
factor extraction, 15 factors (i.e., 16 with intelligence) and, through the marker var- 
iables, they were confirmed to be the same factors in all three factorings, i.e., to be 
the same in both experiments here and the same as named in the original study. 

It may be asked how far the inclusion of more items that are good measures of 
the factors found in the first factorization prejudges the structure of a second factor- 
ing. The writer would answer (1) the insertion of items high in one factor does not 
strengthen pre-existing hyperplanes for the other factors (unless factors are ortho- 
gonal and the items are factor-pure as well as highly loaded). (2) An infinity of rota- 
tion positions are still possible, so if the same is found again it is proof that the 
structure is inherent and that new items adhere to it for this reason, since they are 
not made to adhere for any other reason”). 

(5) A total of 1552 newly constructed items were brought into three extension 
questionnaires. Extension A began with 720, reduced to 450 before final correlation 
on a group of 295 men and women undergraduates. These items were relevant only 
to the initial factor structuring and were actually used for Form C, whereas ex- 
tensions B and C were used here for intensifying Forms A and B of the 16 P. F. Ex- 
tension B began with 512 items, reduced by the first steps to 252 and then correlated 
on a sample of 408 young men and women, half Air Force, half undergraduates. Ex- 
tension C began with 320, reduced to 200 before correlation on 200 men and women 
undergraduates. From the 1552 items, 110 eventually strengthened the original 16 
P. F. (replacing the weakest 110 of the 374 items in A and B forms). 

(6) Further work on the structure, psychological meaning and prediction value 
of the factors, in clinical and other work, is in progress. As to the re-standardization 
of the revised test it may be pointed out that one of the advantages of factor scales 
is that the clinical and occupational profiles, criterion regressions and specification 
equations found for standard scores on the older test continue to apply (with some 
attenuation correction) to the new. The meaning of the present factors in terms of 
second order factors is being determined from Table 1°. It is also an important 
aspect of the meaning of factors to determine whether they persist in different cul- 
tures and for that reason the present confirmation of constancy within a culture is 
being extended by a similar comparison of factorizations on British, French, Italian, 
Indian and Chinese versions of the 16 P. F.“. 


TRAIT JUDGMENTS FROM PHOTOGRAPHS AS A PROJECTIVE DEVICE¹
Northwestern University Chicago, Illinois


PROBLEM 


This paper reports on judgments of character made from photographs, analyzed 
to test the hypotheses of consistent individual differences and stimulus generaliza- 
tion implicit in the use of such tasks as projective devices. The assignment to look 
at the photograph of a person and tell about his character is one of the oldest “ and 
most typical of the projective test formats, even though it has seen little direct ap- 
plication in typical clinical situations, being more used in attitude research: #4). 
This approach to personality and attitude assessment has seemed promising to the 
present writers not only because of an intriguing and incomplete research record, but 
also for a priori reasons. The most important elements in the environment of people 
are other people, and personality deviations and maladjustment are quite frequently 
characterized by biases or misconceptions in the evaluations of these other persons. 
Asking people to characterize other persons in considerable number would seemingly 
be a rather direct way of getting at persistent idiosyncrasies, biased perceptions, or
biased response tendencies. 

One could construct such a test by clipping photographs from magazines. How- 
ever, such tests not only have dubious legal standing when duplicated for general 
use, but also are apt to appear to be just what they are to the respondent and to 
create in him the kind of suspicion of the experimenter’s purpose that makes responses
hard to interpret. We wanted a test in which it was obvious that these were
not photographs casually clipped from magazines, a test in which it was plausible
that we knew the correct answers about these persons’ character and were asking the
respondents to try to guess what these right answers were. We also wanted a test
in which we had the photographic subject’s permission to use his photograph in this 
way. These conditions made it necessary for us to collect our own photographs. 


CONSTRUCTION OF THE TEST 


The basic design of the photograph collection was to obtain pictures of men and 
women of four different age levels, teenage (15-18), young (25-35), middle-aged 
(40-55), and old (65+). In addition, for the specific focus on attitudes toward auth- 
ority under which the test was initially prepared, photographs were sought to repre- 
sent a group of weak looking middle-aged men and a group of strong or dominating 
looking middle-aged men. Photographs were obtained from the following sources: 
the junior class of a high school, a group of business women in a residence club, a 
group of firemen, members of a fraternal order, a group of Army officers, men and 
women from old people’s homes, individuals attending a county fair, and men from 
the skid-row area of Chicago. From all of these persons, a legal permission statement 
was obtained and, in the case of minors, their parents’ consent in addition. 

The second author made the contacts with the groups and did the photograph- 
ing. A systematic set of poses was taken, using a constant distance and an electronic 
flash light. Variations in power source reduced the uniformity of quality. However, 
it is believed that the situation and photographic style are more uniform than has 
been achieved in comparable tests. Two forms of the test were prepared, which will 
be designated Form A and Form B. In each there are fifty persons represented, ten to 
a page. Each page has one each of the ten basic categories of persons. 


¹This study was supported in part by the United States Air Force under contract No. AF 18(600)-170,
monitored by the Crew Research Laboratory, Air Force Personnel and Training Research Center,
Randolph Air Force Base, Randolph Field, Texas. Permission is granted for reproduction, translation, 
publication, use and disposal in whole and in part by or for the United States Government. 

The instructions for the test, in its free response form, were as follows:


This is a test to learn how good you are at judging what a person is like from
his picture. In the photo booklet you will find a series of pictures of a wide variety
of people from many walks of life. For each person you will find two pictures:
the first is a profile and the second is a picture of the person posing himself
in a mirror. In both he was allowed to pose himself as he wished. You will 
notice that everyone is wearing a cape in both poses so that there are no differ- 
ences in clothing. Along with the photographs, information was collected on the 
personality traits of each individual. The accuracy of the judgments you are 
going to make will be scored on the basis of this information . . . Write down
what you think the person in the picture is like. We are concerned primarily 
with how well you can judge personality make-up. Although you may find it 
difficult to make guesses about some of the people, simply do the best you can. 
You do not have to give any specific number of traits. Just write down as many 
as you think of. Do not spend too much time on any one picture. Use single
word descriptions or short phrases if possible. 


In the checklist answer form, the separate answer booklet asked that each person 
pictured be evaluated in terms of thirty trait terms, indicating which terms applied, 
and which definitely did not apply, leaving the other terms unchecked.²


ANALYSIS AND RESULTS 


The present report deals with internal analyses of the test designed to check on 
basic assumptions of consistent individual differences in response to the photographs 
in general, consistent individual differences in response to specified categories of 
photographed persons, and stimulus generalization along sex and age lines. 


Analysis of variance among responses. The design of the test booklets into photographs
of two sexes and four age levels facilitates the application of analysis of variance
statistics to the trait judgments. A series of such analyses has been done,
utilizing responses to the eight basic types, omitting consideration of the weak and 
strong middle-aged men. These analyses employed four classification criteria: Age 
of photographee, Sex of photographee, Page of test booklet, and Respondent. The 
analyses have been done for the free response form of the test, with each judgment 
rated for favorableness, intermediacy, or unfavorableness on a three point scale. 
While this scoring is only one of the many possible scorings, it probably represents an 
optimal one. The interjudge reliability for this scoring was .95 and .86 for different 
pairs of judges. Judgments for each analysis are based upon a single judge’s ratings, 
to hold judge variance constant for the comparisons involved. To avoid correlation 
between the judge effects of fatigue, respondent stereotyping, and the like, responses 
to a given photo by all respondents were rated together, with the answer booklets 
being shuffled between photos. For Form A, the following populations of respond- 
ents were used: 94 bomber crew personnel, officers and enlisted men, tested at Ran- 
dolph Air Force Base; 33 pre-flight cadets, tested at Lackland Air Force Base; and 
92 women in W. A. F. training at Lackland Air Force Base. On Form B, there were 
40 pre-flight cadets. 

From the point of view of interpretable test scores, the most general kind of a
question which one might ask of these analyses of variance is about consistent individual
differences in the tendency to see photographed persons in a generally favorable
or unfavorable light. This would turn up as a significant main effect of the Respondent
classification criterion in the analysis of variance. No such significant effect
was found for the three male populations. However, for the WAFs, this effect was
significant at the .005 level. There are thus consistent individual differences among
the WAFs in their tendency to rate the pictures as a whole favorably or unfavorably,
irrespective of photo type. The failure to find a significant main effect for the other
populations is actually encouraging, in that it indicates the potential usefulness of
considering an individual’s differential pattern of response to different sex-age categories.
While not relevant to the diagnostic potentialities of the test, it can be noted
that for none of the populations were the main effects of Sex, Age, or Page significant.


²It will be apparent to those interested in projective test development that the present study exploits
only a fragment of the potential uses of these test booklets. In addition, the population of persons
tested is inappropriate to many projective test hypotheses. For these reasons, and because of the
immense amount of effort that goes into the preparation of such materials, it seems possible that others
would be interested in employing the test booklets or photographs in different research settings. Within
the limits of present supplies and practicality, the authors would be happy to comply with such
requests.

The next question of psychological interest asks about consistent individual 
differences in pattern of response to Age or Sex categories. This is a statement about 
first order interaction involving Respondents. The Respondent-Sex interaction is 
significant at the .01 level for the Lackland Cadets, Form A, and at the .05 level for 
the WAFs. For these two populations, there is a significant tendency for some persons
to consistently rate men more favorably than women, while other respondents
rate women consistently more favorably than men. The Respondent-Age interaction 
is significant for only one of the four groups, for the Lackland Cadets, Form A, at the 
.01 level. For this population, respondents have consistently different patterns of 
response to the different age groups. For none of the groups is the Respondent-Page 
interaction significant. While not relevant to the diagnosis of individual differences, 
it can be noted that none of the remaining first order interactions among Sex, Age, 
and Page are significant. 

The Respondent-Sex-Age interaction would indicate that respondents had 
idiosyncratic but consistent patterns of evaluating various age-sex categories. This 
interaction is significant at the .001 level for the Randolph population, and at the 
.01 level for the Lackland Cadets, Form B. To have found this interaction significant 
for all of the populations would have provided optimal encouragement for the utility 
of the test in measuring subtle personality constellations. 

The Respondent-Sex-Page interaction and the Respondent-Age-Page interaction 
are both significant at the .001 level for the Randolph population, and the latter at 
the .01 level for the Lackland Cadets, Form B. These interactions are difficult to 
interpret. Significant for all populations, with p values of .005, .001, .001, and .01, is 
the Sex-Age-Page interaction. This three way classification specifies the individual 
photographees and emphasizes the fact that the pictures had quite marked individ- 
uality, with favorable or unfavorable connotations shared by the respondents as a 
whole. The strength of the individual picture pull accentuates the problem of 
sampling photographees, and using enough photographs in the test, so that the 
sampling of examples achieves stability. All of the evidence from the present study 
indicates that this is apt to be a serious problem. The mean profiles by category for 
Form A and Form B are markedly different, indicating that sets of five pictures in 
any one age-sex category are not enough to average out sampling differences. Once 
evaluative data on the photographs are available, however, it would be possible to 
assemble better matched forms than our present two. 

In general, while the analyses from the various populations are disappointingly 
lacking in parallel findings, they are collectively encouraging for the existence of 
significant, consistent individual differences among respondents. For the WAF 
population, the Respondent main effect and the Respondent-Sex interaction are 
significant. For the Lackland Cadets, Form A, the Respondent-Age and the Res- 
pondent-Sex interactions are both significant. For the other two populations, the 
Respondent-Age-Sex interaction is significant. 


Reliability of subscores. Analysis of variance leaves one rather far from the practical
task of measuring individual differences and assigning scores. It also lacks the
palpability of more descriptive statistics of the correlational form. For these reasons, 
a study of reliability of subscores has been made which in some ways repeats, and 
confirms, the analyses just reported. Basically, each respondent has been assigned 
ten scores, one for each of the ten photographee categories. Each of his scores is 
based upon his response to the five photographees of that specific age-sex category.
The analysis of variance type of internal consistency type of reliability coefficient 
has been used®). In addition, the reliability of an over-all score, based on all fifty 
pictures, has been computed. Table 1 presents the results. 
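
As an illustration of the analysis of variance type of internal consistency coefficient referred to above, the sketch below computes Hoyt's coefficient in its usual form, one minus the ratio of the residual mean square to the between-respondents mean square, from a respondents-by-photographs table. The function name `hoyt_reliability` and the small score tables are hypothetical and are not data from the study.

```python
def hoyt_reliability(scores):
    """scores: one row per respondent, one column per item (photograph)."""
    n, k = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Two-way decomposition without replication: respondents, items, residual.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return 1 - ms_error / ms_rows

# Hypothetical favorableness ratings (1 to 3) by four respondents
# of the five photographs in one age-sex category.
category_scores = [
    [3, 3, 2, 3, 3],
    [1, 2, 1, 1, 2],
    [2, 2, 2, 3, 2],
    [1, 1, 1, 2, 1],
]
print(hoyt_reliability(category_scores))

# Respondents who differ only by a constant give a zero residual
# and therefore perfect internal consistency.
consistent = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(hoyt_reliability(consistent))
```

With only five items per category, a few inconsistent ratings move the residual mean square a good deal, which is one reason short category scores can be unstable.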


TABLE 1. RELIABILITY OF SCORES








Comb. Rand. Rand. Lack. Lack. Lack. Lack. Lack. Lack. 
Factors Men Off. Enl. Cad. Cad. WAFs Cad. Cad. WAFs 
Free Free Free Free Free Free Check Check Check 
A A A B ! A B A 


N 183 66 $- 33 40 


Teen Girls .2e AL = 54 6 
Young Women 21 : .26 By 
Middle Women 29 15 3 .25 
Old Women .29 . .19 
Teen Boys 20 .Of 3s 34 
Young Men a sa ot 06 
Middle Men oe .06 3: .53 
Old Men 06 0% X 26 
Weak Men a mm 16 21 
Strong Men we : 4S .08 .42 
Sig. p <.01 2 36) 32 45) (41) 


30) 
Total Score ay j 69 63 a 77 
Sig. p< .01 (.22) 3! : (.42) (.38) 23 28) 
Total (1/10) (.16) ( (.18) (.15) 





For this analysis, the Randolph population has been separated into two, treat- 
ing officers and enlisted men separately. In addition, some of the respondents of the 
various populations who took the checklist form have been included. For the check- 
list form, a score with a potential range of twelve points was employed. Four relative- 
ly unambiguously favorable traits and eight unfavorable traits were selected, such 
that in the over-all population, the frequency of usage of the unfavorable traits 
equaled that of the favorable. Attention was paid only to the ‘‘applies’’ column of 
checks, inasmuch as some respondents had misunderstood the intention of the 
“definitely does not apply” response. A counting of the frequency of the use of these 
twelve terms, scored as a net unfavorable count, was used as the measure of response 
to each photograph. These were summed for the category scores and the overall 
scores. In addition to the reliability values, Table 1 presents the minimum value for 
the specific category score reliability that would represent statistical significance at 
the .01 level, computed in terms of Hoyt’s®? analysis of variance derivation of the 
reliability formula. The comparable value for the score for all fifty pictures is also 
presented. The significance level is a function of the number of items and the num- 
ber of respondents. 

From Table 1, it can be noted that many of the category scores are, for the 
number of cases involved, not significantly reliable, or internally consistent. Nor 
is the pattern of high and low reliabilities over the categories parallel from popula- 
tion to population. Many of the category reliabilities are, however, impressively 
high, considering that they are based upon but five items. The question can be ask- 
ed, are these specific categories more homogeneous than the over-all set of fifty pic- 
tures? To answer this, a value has been inferred from the total score reliability, such 
as would represent the reliability of a scale equally homogeneous, but only one- 
tenth (or five items) long. The Spearman-Brown prophecy formula has been used 
for this purpose. These values are also indicated in Table 1. In general, the compari- 
sons confirm and specify the general trends indicated by the analysis of variance. 
For example, the WAFs, for whom the Respondent main effect was significant, show 
the highest reliability. For the WAFs, the Respondent interactions were not signi- 
ficant, and of the eight categories involved in the analysis of variance, only four have 
reliabilities higher than the one-tenth-of-total value. For the other groups, with the
exception of the Randolph officers, the bulk of the category reliabilities indicate
higher homogeneity than the composite total, confirming the significant interactions
involving Respondents in the analysis of variance.
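
The one-tenth-of-total values can be reproduced with the Spearman-Brown prophecy formula, r' = kr / (1 + (k - 1)r), taking k = 1/10. The sketch below is illustrative only; the function name is hypothetical. The inputs .69 and .63 are values that appear legible in the Total Score row of Table 1, and the results agree with the (.18) and (.15) legible in the one-tenth row.

```python
def spearman_brown(reliability, length_ratio):
    """Reliability expected when test length is multiplied by length_ratio."""
    return length_ratio * reliability / (1 + (length_ratio - 1) * reliability)

# Shrinking a fifty-picture total score to the length of one five-picture category.
for total_reliability in (0.69, 0.63):
    tenth = spearman_brown(total_reliability, 0.1)
    print(total_reliability, round(tenth, 2))
```

A category reliability above this value suggests that the five pictures of that category hang together more closely than fifty pictures drawn from all categories would lead one to expect.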

In general, reliability analysis is encouraging for the possibility of obtaining
consistent differential response scores, although the small number of items in the
categories creates trouble. Potentially, however, this could, in part, be overcome by
combining the ten categories into fewer, if this did not reduce the category homogeneity.
To this possibility, the following analysis is relevant.


Correlational analysis of stimulus generalization. Whether tests such as this be 
justified in terms of psychoanalytic theory, perception theory, or learning theory, 
assumptions about stimulus equivalence or stimulus generalization are involved. 
The most crucial of these assumptions is that of generalization from symbolically 
represented persons of appropriate age and sex to actual persons, such as parents, 
bosses, and the like. But also important is the assumption of stimulus generalization 
or equivalence among the test stimuli themselves, and according to the categories 
utilized in the scoring procedures. The analysis of variance and the reliability an- 
alysis are appropriate to testing this hypothesis for stimulus equivalence among the 
photographs in a specific category. To some extent, these analyses have justified 
the assumption. However, correlation coefficients among the ten category scores 
would provide a more direct test of the assumptions of age and sex generalization. 
Matrices of such correlation coefficients have been computed for each of the popula- 
tions. One such matrix is shown in Table 2. 


TABLE 2. CATEGORY CORRELATIONS FOR COMBINED MALE POPULATION, FORM A,
FREE RESPONSE, N = 183.








Women Men 
Mid Old Teen Young Mid. Old Weak 


Factors Teen Young 


Women 
Teen 
Young .16 
Middle-Aged 18 
Old 15 


Men 
Teen S$ 16 
Young 10 16 ‘ ; 05 
Middle-Aged i .09 a ‘ 08 : 
Old .08 13 . : - .02 : 14 


Weak .07 16 : : A 25 .22 2 
Strong 04 15 P Be 18 0 .10 a .33 





The general expectation from age as a dimension of stimulus similarity would 
be that the adjacent age categories would correlate more highly than those farther 
apart. Thus, old women should correlate more highly with middle-aged women than 
with teenage girls. The correlation tables have been arranged so that within each 
sex, if there is stimulus similarity according to age, the values nearer the diagonal 
should be higher. In addition, one would expect age generalization across sex lines, 
with old women to correlate more highly with old men than with young men. The 
underlined diagonal in the set of correlations between men and women categories 
would thus be expected to show the highest correlations in that set, with decreasing 
values farther from the diagonal. Casual inspection of the correlation matrices indicates
many exceptions to these rules.

From the arrangement of the tables, it is somewhat more difficult to check by 
inspection the assumption of stimulus similarity according to sex. The comparisons 
involved are of this nature: Teenage girls should correlate higher with young women 
than with young men. These can be checked by comparing a correlation in the intrasex
part of the matrix with its two parallel values in the intersex portion of the
matrix. Following up on such comparisons will reveal many exceptions.

To summarize the outcome of large numbers of such comparisons, sign tests 
have been used, and the outcomes of these have been summarized in Table 3. Three 


TABLE 3. NUMBER OF CONFIRMED STIMULUS GENERALIZATION PREDICTIONS.








                      Comb.  Rand.  Rand.  Lack.  Lack.  Lack.  Lack.  Lack.  Lack.
                      Men    Off.   Enl.   Cad.   Cad.   WAFs   Cad.   Cad.   WAFs
Factors         Tot.  Free   Free   Free   Free   Free   Free   Check  Check  Check
                Pred. A      A      A      A      B      A      A      B      A





Age within Sex | 21 14 10 14 10 13 ‘ F 19 12 
p value .002 


Age across Sex 24 20 ¢ : 11 : 20 12 
p value .001 

Sex 24 12 18 15 13 19 
p value .012 .003








types of comparisons have been separated: 1. In age generalization within sex,
twenty-one pairs of correlations have been compared, in each of which one co- 
efficient represented the correlation between adjacent or more nearly adjacent ages, 
while the other represented the correlation between less adjacent ages. In all com- 
parisons, one age category was involved in both coefficients being compared. These 
types of comparisons were involved: Old men should correlate with middle-aged 
men higher than with young men; old men should correlate higher with middle- 
aged men than with teenage boys; old men should correlate higher with young men 
than with teenage boys, etc. Eight such comparisons were made within each sex, 
and, in addition, five other comparisons were possible by considering the strong 
men as middle-aged, and the weak men as covering middle and old age. (These ex- 
pected loci of high correlation from age generalization are also indicated by italic- 
izing in the correlation matrices.) These twenty-one comparisons were made for 
each of the nine correlation matrices, with results as shown in Table 3. For only one 
of the populations does the proportion of confirmations reach statistical significance.
For the Lackland Cadets, Form B, Checklist, the comparisons are in the expected 
direction 19 times out of the 21, which has a probability value by a one tailed test of 
less than .002. For this group, there was one failure of prediction within the male
comparisons, and one within the female. In general, the results do not confirm the 
existence of any stimulus generalization effect making adjacent ages more similar 
than remote age groups. Any slight trends in this direction are judged to be of too 
little magnitude or consistency to be useful in projective test interpretation. 2. Age 
generalization across sex involves twenty-four comparisons of the following sorts: 
Old men should correlate higher with old women than with middle-aged women; old 
men should correlate higher with old women than with young women; old men 
should correlate higher with old women than with teen-age girls; etc. In these com- 
parisons, only same-age pairs have been used as the basis of comparison. That is, 
comparisons of this type have not been used: Old men should correlate higher with 
middle-aged women than with young women. The number of correct predictions out 
of the twenty-four for each of the nine matrices is shown in Table 3. Of these, three 
have a score of 20 out of 24, which is significant at the .001 level. For two of these, 
there is overlap in the respondent population, in that the Combined Men Form A, 
Free Response, includes the Lackland Cadets, Form A, Free Response. Once again, 
the Lackland Cadets, Form B, Checklist shows significant age generalization. But 
overall, the degree of confirmation is far less than that expected, and the effect seems 
totally missing for a number of the populations. 3. Sex generalization is tested by
twenty-four comparisons of the sort: Old women should correlate higher with middle-aged
women than with middle-aged men. In all cases, age is held constant between
the two different sex categories involved. For the WAF population in both checklist 
and free response form, sex generalization reaches statistical significance. To some 
significant extent, for the WAFs, responses to two categories of the same sex are 
more similar than for two categories differing in sex. For the male populations, sex 
seems to provide no source of stimulus similarity, to judge from response similarities. 
This finding is so out of line with the common assumptions of psychology as to be 
unacceptable. The present writers believe, however, that the present study provides 
a fair test, and can only hope that others will be moved to replicate the experiment 
with different stimulus materials, different populations, and different conditions of 
administration.³
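The significance attached above to a score of 20 correct predictions out of 24 can be checked with a one-tailed binomial (sign) test. The sketch below is an illustration only: it assumes that each of the twenty-four comparisons is an independent trial with a chance probability of .5, an assumption that comparisons drawn from one correlation matrix do not strictly satisfy, and the function name is hypothetical.

```python
from math import comb


def sign_test_upper_tail(hits: int, n: int, p: float = 0.5) -> float:
    """Probability of `hits` or more correct predictions out of `n` by chance."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(hits, n + 1))


# 20 correct predictions out of 24, chance level .5 (independence is assumed)
p_value = sign_test_upper_tail(20, 24)
print(f"{p_value:.5f}")
```

Under these assumptions the tail probability falls below .001, in line with the level reported in the text.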


SUMMARY 


This study reports on several analyses of the responses of various groups of Air 
Force personnel to a test requiring that fifty persons presented in photographs be 
described in personality terms. Both checklist and free response forms were employ- 
ed. Responses were scored in terms of degree of favorableness. Analyses of variance 
using the classification criteria of Age of photographee, Sex of photographee, Page of 
test booklet, and Respondent, while inconsistent in outcome, support the hypothesis 
of consistent individual differences in response to specific Age and Sex categories of 
photographs. A thorough reliability study in general supports the hypothesis that 
responses to specific Age and Sex categories are more homogeneous than are res- 
ponses to the fifty photographs as a whole. However, intercorrelation of scores from 
the ten specific age-sex categories provides inconsistent support for age as a dimen- 
sion of stimulus similarity. Sex is a significant source of stimulus similarity for the 
women respondents, but not for any of the male groups. These findings are thought
to be quite out of keeping with the common assumptions of psychologists in utilizing
such tests.

³As has been noted above, the reliabilities of the category scores were quite uneven, and it seemed
possible that the failure of the correlations to show the expected pattern might be due to this. For this 
reason, the two matrices based upon the largest number of respondents, i.e., Combined Men, Form A, 
Free Response, and Lackland Cadets, Form A, Checklist, were corrected for attenuation, and the 
predictions rechecked. This correction did not improve the level of prediction, and does not serve to 
explain the failure. 


One of the most important developments arising in clinical psychology in 1955
stems from the work of Dr. P. E. Meehl. Writing with L. J. Cronbach, with A.
Rosen, and also in his presidential address to the Midwestern Psychological Asso-
ciation, Meehl has explicated three important problems in what may be termed
objective psychodiagnostics. The implications these three papers have for workers
in this area and for the studies appearing this year deserve examination and com-
ment. Each paper is significant in its own right but taken together they go a long
way towards extending the ideas set forth in his earlier volume and outlining the
dramatic implications for current psychodiagnostics of the theory of statistical de-
cisions.


IMPORTANCE OF BASE RATES


In the last few years most clinicians seem to have come to understand that mere 
statistical stability of a difference between measures of central tendency of scores 
from samples of patients and normals provides little practical help in making de- 
cisions about a particular patient with a given score. Proper emphasis is now being 
placed more upon the actual amount of overlap of the respective distributions by 
test developers and publishers. In a closely reasoned paper, Meehl and Rosen set
forth the sobering fact that even excellent reduction in distributional overlap (al- 
though necessary) is not sufficient to assure accuracy in decisions about a given case. 
The antecedent probability, or base rate, of a given characteristic in the general run 
of clinical cases being seen is of crucial importance in determining the efficiency of a 
cutting point on any score distribution. Since these values will fluctuate from install- 
ation to installation, and undoubtedly from time to time in any given clinical agency, 
the test producer cannot provide this additional information for the individual clin- 
ician in the use of a diagnostic device. The clinician must be prepared, both in aware- 
ness of the problem and sophistication about the solutions required, to establish 
these values in his own setting. 

For example, Churchill and Crandall report an over-all accuracy of about
60% in identifying cases who will seek counseling, using scores on the Rotter In- 
complete Sentence Blank (ISB). Fifty-nine per cent of the female and sixty-one per 
cent of the male college students who do or do not seek help at a counseling center 
within two years after admission to school were correctly labeled by using an a priori 
cutting score on the ISB administered routinely to them as entering freshmen. In 
this analysis they did not take into account what Meehl and Rosen call base-rate 
asymmetry. When the small proportions of the counseled students of each sex in the 
total student population (an over-all rate of 26%) are taken into account, it is found
that 44 of the 87 women labeled potential counselees by this cutting score and 49 
of the 62 men do not turn out to be counselees by their criterion. Other clinicians 
who may try to apply their findings to groups where the rates for potential counselees 
are even lower would have to expect even more disappointing results. 
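The counts quoted above can be turned into the proportion of labeled students who actually became counselees. The short sketch below does only that arithmetic; the function name is hypothetical, and the counts are the ones given in the text.

```python
def proportion_confirmed(labeled: int, false_positives: int) -> float:
    """Share of students labeled 'potential counselee' who did seek counseling."""
    return (labeled - false_positives) / labeled


# Counts as quoted in the text: 44 of 87 women and 49 of 62 men were false positives.
women = proportion_confirmed(labeled=87, false_positives=44)
men = proportion_confirmed(labeled=62, false_positives=49)
print(f"women: {women:.2f}, men: {men:.2f}")
```

On these figures fewer than half of the women and about a fifth of the men picked out by the cutting score were later counselees, even though roughly 60% of all students were classified correctly.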

Similarly, in a study by Davidson et al. the usefulness of a combination of
the Sheldon somatotypes and a single cutting score on MMPI scales as a screening 
procedure in identifying breakdown in psychological health was evaluated. Equal- 
sized samples of 100 cases of psychologically ill and of normal student volunteers at 
Oxford University were used and conclusions were drawn as if each condition were 
equally probable in the student population. Adequate attention to the base rates in 
this study would have led to less sanguine conclusions on the potentialities of their 
screening procedure. Thus, although they report only 19% incorrect placements in
the total group studied by using this very simple configural technique in which all
MMPI scores were treated the same, allowance was not made for the fact that break-
downs of this severity occur in no more than 5% of the students, as reported by one
of the authors. At that base rate, over four times as many normals would be labeled “lacking in
psychological health” as actual patients (17 to 4 in every hundred students). Con-
sidering a population the size of the Oxford student body (over 6000), more than a
thousand healthy students would be studied as possibly psychologically ill in order
to identify at most 240 needing psychological help. Davidson et al. take up the
use of subsequent psychiatric interviewing after initial screening in what Meehl and
Rosen discuss as the “successive-hurdles approach.” Although the amount of shrink-
age on cross-validational samples was not directly considered, appropriate use of the
interview material would reduce the per cent of misses to 12% in each group. The
expense of interviewing, even “on one occasion for 50-80 minutes,” as large a group
of false positives as noted above seems exorbitant.
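The figures in this paragraph follow from applying error rates to the 5% base rate. The sketch below reproduces them under stated assumptions: a hit rate of .80 among the ill and a false-positive rate of .18 among the normal students are values chosen here because they average to the reported 19% error in equal-sized samples and give the 17-to-4 ratio; the separate rates from the original study are not given in the text. The function and parameter names are hypothetical.

```python
def screening_outcomes(population: int, base_rate: float,
                       hit_rate: float, false_positive_rate: float):
    """Return (valid positives, false positives) expected from a screening cut."""
    ill = population * base_rate
    healthy = population - ill
    return ill * hit_rate, healthy * false_positive_rate


# Assumed rates: .80 hits among the ill, .18 false positives among the normals.
valid, false_pos = screening_outcomes(6000, 0.05, 0.80, 0.18)
print(round(valid), round(false_pos))  # 240 valid positives, 1026 false positives
print(false_pos / valid)               # false positives per valid positive
```

With these assumed rates a student body of 6000 yields about 240 correctly identified cases and just over a thousand healthy students flagged for interview, which is the comparison drawn in the text.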


PROGNOSTIC METHODS


Obviously many of the same problems occur in the utilization of prognostic 
indicators. Meehl and Rosen discuss briefly the difficulties in applying some of the 
predictive signs from psychological tests or case history material in the selection, or 
more precisely, in the placement of patients for treatment. Even highly successful 
prognostic indicators are not worth scoring up if almost all cases are accepted for 
general treatment. Some of the research this year suggests that many of our accepted 
prognostic instruments are far from this level of success.

So far, very little work has been done on the differential assignment of cases to 
particular therapies or therapists, and little additional material appeared this year 
directed at this problem. The research on prognostic indicators this year has been 
frequently beset with criterion problems. Rather than objective and independent 
criteria of improvement or lack of improvement, studies on this problem have dealt 
with the easier datum of length of therapy. For many reasons this problem is a
legitimate research question but it cannot be equated with success in treatment.
When combined with the judgment of the therapist that treatment termination was 
premature, however, it can obviously be used as one index of failure to achieve treat-
ment goals. Examples of this sort of research appear in the papers of Calden et al.
and Moran et al. on tuberculous patients.

Meehl and Rosen go on to discuss the effects of the magnitude of the base rate 
on the utility of devising special techniques for diagnosis or prediction; nearly uni- 
versal or very rare events constitute the most difficult criteria for clinicians. Under 
such circumstances, the authors are able to show, adopting a new cutting score that 
increases the percentage of valid positive decisions in a patient group at the expense 
of only a small change in per cent of false positives in the normals may actually result 
in an increase in the number of erroneous diagnostic decisions. Such a result would 
undoubtedly be produced in the material reported by Powers and Hamlin if real
base rates were employed instead of small, equal-sized samples of normals and var-
ious diagnostic groups. In their article they report the percentage of hits with two
different cutting scores. Since the cost would only be ten per cent of the normals to
get an increase of 40-50% of psychotics, it would be tempting to change the cutting
score unless the relative frequency of normals and psychotics is borne in mind. The 
solution to this problem may be to concentrate upon certain subpopulations where 
the base rate is substantially lower or higher than in the total clinical population 
(approaching an optimal 50% value). 
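The argument about changing a cutting score can be made concrete by counting the net change in correct decisions per case. This is a minimal sketch, assuming a gain of .45 in valid positives among psychotics (the midpoint of the 40-50% quoted above) at a cost of .10 more false positives among normals; the function names and the illustrative base rates are assumptions, not figures from Powers and Hamlin.

```python
def net_gain_in_hits(base_rate: float, gain_in_valid_positives: float,
                     added_false_positives: float) -> float:
    """Change in the proportion of correct decisions after moving the cutting score."""
    return (base_rate * gain_in_valid_positives
            - (1 - base_rate) * added_false_positives)


def break_even_base_rate(gain_in_valid_positives: float,
                         added_false_positives: float) -> float:
    """Base rate below which the new cutting score adds more errors than it removes."""
    return added_false_positives / (gain_in_valid_positives + added_false_positives)


for base_rate in (0.50, 0.20, 0.05):  # illustrative base rates only
    print(base_rate, round(net_gain_in_hits(base_rate, 0.45, 0.10), 3))
print(round(break_even_base_rate(0.45, 0.10), 3))
```

With equal-sized samples (a base rate of .50) the change looks clearly worthwhile; at a base rate of .05 the same change produces a net loss in correct decisions. Under these assumed figures the break-even base rate is about .18.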

Caution must be exercised in applying some of the strictures of Meehl and Rosen 
about the utility of various psychological techniques in diagnosis or selection for 
treatment since some of these procedures have uses other than for these problems 
and what may prove to be a waste of time in improving diagnosis or case selection 
may prove to be well worth the time in research as measures of change of status, 
control variables for matching groups before differential treatment, or as a means of 
improving upon or clarifying the validity of certain constructs (see below). In each 
separate issue, however, these same problems will arise and must be handled directly. 


DECISION MAKING PROCESSES


These questions lead directly into a consideration of problems of the strategy of 
clinical decision making. Questions arise about whether a certain diagnostic technique 
should be used routinely on all cases or whether it should be reserved (because of base 
rate level, expense, danger, or whatever) to patients screened out by other workers or 
other tests. Such problems are being decided every day on a haphazard, intuitive 
basis in medical and psychiatric services everywhere. The need for making these 
problems explicit and bringing research results to bear on practical decisions of 
clinic management and organization is compelling. In the membership of the con- 
temporary psychiatric team, the psychologist is usually the professional person 
qualified to carry out such studies which have great promise for the improvement of 
not only his own professional activities but the functioning of the whole team. 

The main requirement is agreement upon a utility function. Agency func-
tions, division of labor implied in differences in professional training and roles, as
well as financial considerations enter into this cost function. In addition, the patient’s 
health is one of the most important sources of costs or risks and these considerations 
must also be spelled out. It is impressive to note how much of the thought on these 
problems in clinical practice is subsumed under our diagnostic system. A summary 
of descriptive patterning of observations, of etiological assumptions, of therapeutic 
choices and of prognostic expectations is provided by a nosological system. Within 
this conceptual framework are specified the alternatives for choice of action, many of 
the loss functions for certain wrong decisions, and some of the costs of these actions 
as well in terms of the patient’s health. 


CONSTRUCT VALIDITY

With the central importance placed in clinical practice on the diagnostic system 
and the obvious role it will play in objective psychodiagnostics, it is refreshing to 
have available the second of the papers alluded to above, by Cronbach and Meehl,
on construct validity. Expanding upon the material in the Technical Recommenda-
tions supplement on several forms of validity, they describe the logical implica-
tions of many of the constructs used in clinical psychology and the need for greater
sophistication in dealing with many test validation problems. The validation of some 
test constructs does not differ appreciably from theory testing since the meaning of a 
construct, and hence its ties with direct observations, depends upon its place in a 
whole set of interrelated propositions, termed by Cronbach and Meehl a nomological 
network. For such constructs, the validity cannot usually be summarized by means 
of a coefficient of relationship with any single, universally-agreed-upon character- 
istic or observation. 

One of the looser but most widely known nomological nets is the psychiatric 
nosological system. The constructs within this system have served as criteria for a 
large segment of psychological test development. As in the past, some of the research 
this year has been devoted directly to this system itself, while a great deal more to 
the relationships of test measures to various sub-systems of this theory structure. 
An example from research this year of a study that combines both problems is that 
of Foulds. Working on a small sample of patients, he scaled the degree of agree-
ment between diagnostic judgments from two psychiatrists on a six-point scale to
show the dependability of the procedures. He also compared the over-all agreement 
of psychiatric diagnoses made by a psychologist (from a test battery highly loaded 
with intelligence tests) with the final composite diagnostic conclusion on the cases. 
In addition to the elaborate scale of diagnostic similarity, he introduced two excellent 
refinements: a separate analysis of agreement on diagnosis by having a psychologist 
go over the test data without having seen the case, and a control over knowledge of 
base rates by having a psychologist and a psychiatrist each make a “diagnostic
guess” for each case from knowledge of age, sex, and fact of referral to the hospital.
The psychologist appeared to agree more with the final criterion diagnosis on the 
cases than the two psychiatrists did with each other, but the statistical stability of
this difference was not tested. Just how the psychologist went about this diagnosing 
was not elaborated upon either, which raises the question dealt with below about the 
private, uncommunicated processes of the clinician. 

By way of contrast to work on the existing nosological nets, this year Leary and 
Coffey presented an outline of an ambitious revision of the nosological system it-
self, attempting to encompass many behavioral phenomena usually considered to lie
within the normal range as well. Although these writers gave evidence of awareness 
of some of the logical and methodological problems that Cronbach and Meehl dis- 
cuss, they are not entirely clear about the way a given construct must be introduced 
and supported within a system such as they propose. Their work can be clarified and 
strengthened by proper attention to some of these considerations. 


CLINICAL JUDGMENTS 


In his presidential address, Meehl raised frankly the issue whether the re-
lationships between test measurements and descripto-predictive statements about
the test subject are best mediated through the inductive-deductive inferential system
of a clinician or through “explicit rules set forth in (some) cookbook.” He quoted
largely from the unpublished work of Halbower, in which descriptions of four fre-
quent MMPI types were devised and tested successfully against clinical judgments
from profiles alone. Meehl makes an excellent case for the method based on a person- 
ological atlas, not only in terms of its obvious dependability and economy, but also 
in terms of its direct relevance to the problems of the agency, its capitalization upon 
stable base rates of the various characteristics in the population studied, its ability 
to discriminate patients, and its precise testability and ease of revision for improve- 
ment of validity. 

No other research on the personality cookbook side of the question became 
available in 1955, but several studies continued to appear on the analysis of validity 
or other features of various professional activities of the clinician. Cline con-
centrated on how several personal characteristics of judges affected judgments, in-
cluding psychological test data on a large sub-group of them. Bendig dealt with
the amount of their psychological training; and Bieri, Blacharsky, and Reid
studied the Rotter ISB results on judges to determine the effect of different levels 
of personal adjustment on predictive accuracy. 

Giedt took up a related problem in his analysis of the clinical judging be-
havior of psychiatrists, social workers and psychologists when presented with var-
ious clinical material in closer and closer approximation to the actual clinical inter-
view. This method promises a great deal ultimately in showing the relative useful-
ness of various sources of information as well as the degree of fidelity required in
recording original observations. It also has value in a detailed analysis of the in- 
ferential behavior of clinicians. Although he limited his attention to predictions or 
judgments about the sentence completion behavior of the subjects, the method 
should be applied to a number of important behavior situations. 


DESIGN PROBLEMS IN CLINICAL RESEARCH 


In addition to the interest in understanding the clinician and his professional 
behavior and in controlling or minimizing his errors of judgment, research in clinical 
psychology this year shows evidence of greater concern with design in research. In 
unrelated problems, Williams and Epstein developed particularly well-suited
designs for the purposes of each study. Also, the power of a control group in evaluat-
ing the effects of psychotherapy was demonstrated in several studies. Although a
careful matching study is yet to appear, Barron and Leary were able to study a
group of patients who had to remain on a waiting list before psychotherapy could be 
started. Although this untreated group was not identical in initial MMPI scores to 
the groups treated either by group or individual psychotherapy, no stable differences
could be found between the changes from either treatment or waiting. Since the ob- 
served changes were statistically stable on many of the MMPI scales, these shifts 
represent the effects either from being evaluated and promised help or from spon- 
taneous improvement that was unrelated to the agency’s efforts. Although this last 
question could not be answered in this design, it is clear that without the baseline 
control of the waiting-list group, the corresponding changes in the treated groups 
would have been ascribed to the treatment going on concurrently. At least, this is 
the way such information has been interpreted in the recent past. 


A further note on the need for careful controls (and comprehensive observations) 
on cases in treatment comes from Rosalind Dymond, who reported on a small
group of cases (“attriters”) who were on a waiting list for therapy and who refused
to start when called. Although initially as disturbed as others seeking help, these 
“spontaneous remissions” sorted the Q-technique materials at the end of the waiting 
period in a way indistinguishable from those clients who went through therapy suc- 
cessfully. The TAT protocols from these attriters did not follow the Q-sort pattern, 
however, so the picture of recovery was not entirely consistent. The fact that this 
phenomenon of spontaneous remission can occur and that it developed in over a 
third of the waiting list controls for this particular study indicates the importance 
this complication can assume in research designs on psychotherapy. 


Clinical research is still not free of the problem of inappropriate control groups 
being selected and confounding the results. Freeman and Grayson attempted to
evaluate the child-rearing attitudes and practices of mothers of schizophrenic pat-
ients in a VA hospital. Besides the difficulty in a study like this of having the child-
ren under consideration already young adults at the time of the interviews with the
mothers, there is a need to control for the effects of having a child in medical (and 
psychiatric) difficulty at the time of the interview as well as for the general effects 
of differences in cultural and educational levels of women who raise children to 
maturity. Although they tried to locate mothers who had sons within the age range 
and none of whom had developed schizophrenia, they contacted their control mothers 
through college classes rather than VA installations, and unfortunately could not 
report on any conditional data that could be used to check the magnitude of the 
biases since they did not collect such information in order to guarantee the anony- 
mity of their respondents. 


ANALYSIS OF RESEARCH PUBLICATIONS


The foregoing material was selected for special consideration in the area of 
clinical research in 1955. The methods of Schofield in analyzing the areas of clin-
ical research were applied to the same journals this year and the tabulation of re-
search topics covered during this period was made. It was found that the relative
stability in the distribution of research energies found over the last several years
did not hold up. Table 1 shows the relative frequencies and rank order of topics this 
year. Table 1 also shows the corresponding frequencies and ranks for 1954; the rank-order
correlation between these two years was only +.657 (see Table 2). This value constitutes a drop
of about twenty points from the levels reported over the last several years. Al- 
though there was an increase in 1955 of nearly 100 articles in these reference journals 
over the number published in 1954, the proportion of these that met the criteria as 
clinical research studies for inclusion in this tabulation was about the same (52%) as
last year. The shift in research emphasis shown in Table 1 is not merely an increase 
in the relative frequency of articles that were based upon empirical data, but a de- 
cided change in the areas that psychologists and psychological editors deem worthy 
of research time and publication space. These trends can be evaluated better from a 
perspective of more than one year, however, and further comment on these findings is 
reserved for a later time. 
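The rank-order coefficient referred to above is presumably Spearman's rho, although the text does not name the formula. The sketch below computes rho from two sets of ranks without ties; the five-topic ranks are invented for illustration and are not the values of Table 1.

```python
def spearman_rho(ranks_a: list[int], ranks_b: list[int]) -> float:
    """Spearman rank-order correlation for two untied rankings of the same items."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n**2 - 1))


# Hypothetical ranks of five research topics in two consecutive years.
ranks_1954 = [1, 2, 3, 4, 5]
ranks_1955 = [2, 1, 3, 5, 4]
print(spearman_rho(ranks_1954, ranks_1955))
```

A coefficient near +1 means that the topics kept nearly the same order of frequency from one year to the next; lower values, such as the +.657 reported for 1954-55, indicate a reshuffling of research emphasis. Tied ranks, which are likely in a tabulation of this kind, would call for the tie-corrected form instead.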


TABLE 1. DISTRIBUTION OF 327 RESEARCH STUDIES REPORTED IN SIX SELECTED JOURNALS IN 1955
BY AREAS OF RESEARCH REPRESENTED, WITH COMPARATIVE DATA FOR 1954.

No. of % of % of 1954 Rank
Studies Total Total 1954

Intertest relationships 48 14.7 16
Normative study (projective techniques) 14.4 1
Validity (projective techniques) 11.6 24
Normative study (personality) 8.3
Normative study (intelligence) 25 7.6
Normative study (structured personality tests) 6.1
Validity (structured personality tests) 5.8
Experimental studies of anxiety 4.9
Physiological studies 6
Objective evaluation of therapy
Mona -nd (prognostic indicators)
Validity of psychiatric diagnosis
Validity (W-B diagnostic patterns)
Analysis of recorded interviews
New tests (projective)
Test standardization
Abbreviated intelligence tests
New tests (intelligence)
Detection of malingering
Miscellaneous


TABLE 2. RANK-ORDER CORRELATIONS BETWEEN CONSECUTIVE YEARS FOR THE FREQUENCIES OF
RESEARCHES IN CERTAIN AREAS OF CLINICAL PSYCHOLOGY

1950-51 1951-52 1952-53 1953-54 1954-55

.838 .821 .778 .657





SUMMARY 


Some of the major implications of the developing area of objective psycho- 
diagnostics were selected for special consideration and for the bearing they have for 
some of the research papers appearing in 1955. In addition, some problems of re- 
search design were considered and exemplified from publications appearing during 
this period. Finally, a tabulation of amount of research appearing in various problem 
areas was presented from a standard sample of psychological journals, and a com- 
parison made with previous tabulations for prior years. 


SPEECH BEHAVIOR AND EGOCENTRICITY¹
MARION WHITE McPHERSON


The Neuropsychiatric Clinic of St. Louis 


PROBLEM 


In connection with an investigation into the consistency of results obtained from 
various projective measures, 20 college students were asked to record their responses 
to the Thematic Apperception Test on a tape recorder. While transcribing their 
stories the author noted the disorganized speech that is so common in projective 
protocols. Repetition of words, incomplete phrases, and inappropriate terms were 
striking in frequency. Interest was aroused in the factors that stimulated these ir- 
regularities but before this curiosity could be satisfied it was necessary to identify 
them. Inspection of the literature made it apparent that “slips of the tongue” have
not been adequately defined. Freud referred to an interruption in speech as a
lapsus linguae. This label connotes physiological and anatomical factors but Freud
discussed various psychological processes of which censoring was identified as one
of the most important. In fact, a common connotation of “slip of the tongue” is
‘saying something that was not intended’. The role of censoring would suggest that
the term “slip of the tongue” is misleading and a term with a connotation of psycho-
logical activity would be more apt. Even in the case of neuromuscular disturbances,
“slip of the tongue” appears to be a misnomer because any or all of the organs used
in speaking may be involved. 


The stories obtained in the present study suggested many failures in censoring 
but there were words and phrases in which the deviations from smooth, well inte- 
grated speech were not the apparent product of inadequate censoring. They might 
have been the result of illogical thinking, perceptual distortion, or changes in per- 
ception as well as irregularities in the psychological process of speaking. An excerpt 
from one protocol exemplifies the confusion as to the nature of a “slip of the tongue.”

... He might also be studying the mechanism in a on a violin, but I think it’s 
a more bitter look he has toward it than one of— It’s a kinda lost look too. I
don’t think it’s one that of generosity and with respect to the violin—that’s
about all I can think about this one. 


The literature offered no assistance in the problem of distinguishing among per- 
ceptual, thinking, and speaking irregularities. The author was therefore faced with 
the necessity of developing a communicable method of isolating and characterizing 
deviations in speaking that were diverse in content, grammatical function, and in 
the number of words involved. The purpose of this paper is to suggest a method of 
identifying and characterizing irregularities in speaking. 


PROCEDURE OF CLASSIFYING DATA 


A preliminary tabulation was made of those irregularities in which inferences 
about antecedent psychological processes were obvious. This was an attempt to 
identify some deviations so as to offer clues to the more perplexing ones and to the 
problem of separating interferences in motor speech from those in other responses 
such as perception and association. This preliminary and relatively easily accomp- 
lished task made apparent four types of irregularities that occurred with decreasing 
frequency: 


I. Saying something and repeating it either exactly or modifying it in 
substitute form. E.g., “The bed on which the young lady is laying, is lying is in
the background.”

II. Saying something that leads the audience to expect an elaboration
and not meeting this expectation. E.g., “I can’t seem to make head nor tail out
of this darned—”


¹The data for this study were collected when the author was Assistant Professor of Psychology,
Department of Psychology, College of Liberal Arts, Wayne University, Detroit, Michigan. Appre- 
ciation is expressed to Shirley Terreberry and John Derr for their contribution to this study. 

III. Using an inappropriate word or phrase. E.g., “The young man has
become so traumatic that he doesn’t know what he is doing.”

IV. Omitting a word essential for ready intelligibility. E.g., “The individ-
ual’s fears of deviating, the way of doing things have kept him from many
things.”


These irregularities suggested four variations from the customary amount of 
attention to the audience: first, overattention as evidenced by the repetition or modi- 
fication of what was said; second, neglect indicated by a failure to elaborate words 
that had been introduced; third, lack of attention sufficiently extensive to violate 
ordinary word usage; and, fourth, failure to inform the audience of components im- 
portant for easy comprehension of what was said. These deviations are suggestive 
of an increasing egocentricity. In the first type the subject is unduly conscious of his 
audience, in the others he is failing to meet his listener’s expectations. This qualita- 
tive impression of degree of egocentricity was supported by the frequency data 
which revealed a progressive reduction from Type I through Type IV. It is rational 
to infer that in speaking as in most psychological responses the more frequent de- 
viations are less serious than the more unusual ones. 

This indication of egocentricity made it possible to formulate criteria which 
differentiate more precisely between irregularities in speaking and other psychologi- 
cal responses and among the various types of irregularities. In spite of these gains 
decisions were sometimes difficult. This was occasioned by the fact that only verbal- 
izations, merely a portion of the total psychological behavior, were available. This 
lack of information about other variables of the behavior sequence makes it ap- 
parent that inferences about the antecedents and concomitants of speaking must be 
tentative. Appended to this article are the principles that were formulated to judge 
the presence of an irregularity in speaking and the methods of classifying those that 
were not immediately obvious as one of the four types. 

After the formulation of these principles all stories were inspected and all irregu-
larities in speaking identified and classified. The results of this count are presented
in Table 1. Since these data are based on a small n, an estimation of the reliability
of the sample appeared to be in order. In order to obtain this a student assistant
recorded TAT stories from ten comparable subjects. The writer identified the ir-
regularities which these subjects made on the seventh and thirteenth cards, the two
on which the original group made the lowest and the highest number of errors. The
results of this tally are also included in Table 1. All irregularities produced by the
original group were also examined for length in terms of the number of syllables
which each contained. Words were syllabized according to Webster’s Collegiate
Dictionary. In Type IV irregularities the briefest possible word was inferred.


TABLE 1. THE PER CENT OF THE FOUR TYPES OF INTERRUPTIONS

Interruptions    Author’s Data    Student’s Data

Type I           63%
Type II          17%
Type III         14%
Type IV           6%

In order to describe the type of referent on which the speakers made irregular- 
ities, each deviation was classified in terms of the specificity of its referent. Empirical 
inspection revealed that the words could be classified into three categories: 

R1—auxiliary words or those which are grammatically necessary but do
not in themselves identify a particular or unique characteristic. E.g., there,
part of, that, and it should be.

R2—words that specify unique infra-human acts, conditions, or things.
E.g., old, the storm is tropical, animal, rocks, two, and waits.

R3—words that refer to human beings, anatomical concepts, or roles in life.
E.g., mother, stomach, his, I, and doctor.

R2 and R3 interruptions that also contained R1 words were counted simply as R2
or R3. Irregularities that involved both R2 and R3 scores were tabulated as combina-
tions of these. The interruptions were also inspected for instances in which the re-
ferents in the story and in the irregularities were directly opposed.

The deviations varied not only in egocentricity, length, and specificity of re- 
ferent, but in their phonetic similarity to the preferred speech. In order to quantify 
this similarity all irregularities were judged as either phonetically the same as the 
story (P1), similar in at least half of the syllables or as having the same root form 
(P2), or unlike in more than half of the syllables (P3). All Type IV irregularities 
were automatically scored as P3. 

To illustrate the application of these concepts two stories are presented with the 
various scores indicated. Those parts of the speech which are considered to be de- 
viations are italicized. The first number in parentheses refers to the type of inter- 
ruption, the second to the number of syllables in the deviation, the third to the re- 
ferent score, and the fourth to the phonetic similarity score. 

12: I’m looking at looking at (I, 3, R2, P1) picture number twelve. The contents of the
picture is (III, 1, R1, P3) as follows: there’s a boy lying on a cot of some type or might be a sofa
or a bed. He (I, 1, R3, P3) his head is resting on a pillow. His eyes are closed. His hands is
(III, 1, R1, P3) resting over his — . il region. Over him is standing an older man. The older
man’s face isn’t visible, that is it’s (I, 7, R2, P2) only partially visible. He’s holding a hand, one
of (I, 3, R3, P3) his right hand over over (I, 2, R1, P1) and above the face of the boy. His (I, 1, R3,
P3) this man’s right leg is resting on the sofa or couch. This picture reminds me of the of (I, 1,
R1, P1) what I’ve read about hypnotism. The older fellow probably has hypnotized the younger
boys (III, 1, R3, P3) and is is (I, 1, R1, P1) making the magical (I, 3, R2, P1) these so-called
magical passes over the boy’s head. What shall precede? Well, let’s see—the so-called hypnotist
will probably give the boy some post-hypnotic suggestions, then awaken him and these suggestions
will be carried out by the boy.

14: This is a boy (I, 1, R3, P3) a man who might be contemplating suicide from a high
story window and he’s (II, 1, R3, P3) before he steps out of the window he stops to to (I, 1, R1,
P1) think back over the way years what (III, 1, R1, P3) driven him to want to do this to himself.
It seems that he has lost out on something that was very important to him. He is (II, 2, R3, P3)
many things have gone wrong in his life. He’s been unsuccessful or cheated in a business or is no
fair isn’t didn’t turn out the way intended to (II, 12, R2, P3) all his hopes and aims and ambitions
seem to — (IV, 1, R1, P3) been washed away.
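The four-part score attached to each deviation in the stories above lends itself to mechanical tallying. The following is a minimal sketch, not part of the original procedure: the function name `parse_scores` and the regular expression are assumptions introduced for illustration, and combined R2 and R3 referent scores are not handled.

```python
import re
from collections import Counter

# Matches a score such as "(I, 3, R2, P1)":
# interruption type, syllable count, referent score, phonetic similarity score.
# Assumption: scores are written exactly in this parenthesized form.
SCORE = re.compile(r"\((I{1,3}|IV),\s*(\d+),\s*(R[123]),\s*(P[123])\)")

def parse_scores(text):
    """Return a list of (type, syllables, referent, phonetic) tuples found in text."""
    return [(t, int(n), r, p) for t, n, r, p in SCORE.findall(text)]

# Excerpt adapted from story 14 above.
sample = ("This is a boy (I, 1, R3, P3) a man who might be contemplating suicide "
          "and he's (II, 1, R3, P3) before he steps out of the window he stops to to "
          "(I, 1, R1, P1) think back")

scores = parse_scores(sample)
type_counts = Counter(t for t, _, _, _ in scores)          # tally by interruption type
referent_counts = Counter(r for _, _, r, _ in scores)      # tally by referent score
```

Tallies of this kind correspond to the frequency counts by type, length, referent, and phonetic similarity described in the text.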


The above analyses rest on TAT stories told as a class assignment by all the 
males enrolled in two semesters of an undergraduate course in personality measure- 
ment. At the time of the testing the students had not had a formal introduction to 
projective methodology but were undoubtedly familiar with the procedures in a 
general way. The sex restriction was imposed because time did not permit the an- 
alysis of data from all members of the class and this group provided both a manage- 
able n and avoided contamination of the data by sex differences. 

The twenty subjects were given the series of twenty cards according to the 
procedure recommended by Murray. This bank of 400 stories was reduced to 391
by occasional rejections of a card, by the dropping of the voice to an inaudible level, 
and in two instances, by failure of the recording apparatus. 

RESULTS

A total of 1380 irregularities were identified in the 391 stories. As previously 
noted, the distribution of the types of deviations is presented in Table 1. All sub- 
jects contributed irregularities and the overall rate was .31 interruptions per line of 
typescript. Individual rates varied from .17 to .58. The most frequent irregularity 
was the exact repetition of words or phrases without intervening words. There were 
a total of 406 of these, 29% of all deviations, or 53% of Type I. 
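The proportions in this paragraph can be checked directly from the reported counts. The sketch below uses only figures stated in the text; the variable names are illustrative, and the per-story rate is a derived quantity that the article itself does not report.

```python
# Counts as reported in the Results section.
total_irregularities = 1380
stories_analysed = 391
exact_repetitions = 406

# Share of all deviations that were exact repetitions (reported as 29%).
share_of_all = exact_repetitions / total_irregularities

# Derived figure, not given in the article: mean irregularities per story.
per_story = total_irregularities / stories_analysed

print(f"exact repetitions: {share_of_all:.1%} of all deviations")
print(f"mean irregularities per story: {per_story:.2f}")
```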

The per cent of total interruptions for the varying number of syllables is pre- 
sented in table 2. The difference between the frequency of interruptions one syllable 
in length and that of any other length is sufficiently gross to support the statement 
that irregularities are most apt to be only one syllable. 

TABLE 2. PER CENT OF TOTAL INTERRUPTIONS FOR EACH NUMBER OF SYLLABLES

Number of Syllables    Number of Interruptions    Per Cent of Interruptions

56

14 or more

The categorizing of referents in the 1380 deviations revealed 54% in R1, 19%
in R2, 21% in R3, and 6% in R2 and R3 combined. The difference between the first
percentage and any subsequent one is again sufficiently large to permit the statement
that the most frequent word or phrase on which there is an irregularity has an in-
definite referent. It is interesting to note that human referents involve such a small
percentage of the total. Only 44 or 3% of the total number of irregularities were
instances in which the content of the deviations and the preferred speech were
directly opposed.

The judgments of phonetic similarity revealed that 39% were rated identical
(P1), 20% similar in at least half of the syllables or with the same root form (P2),
and 42% as unlike in more than half of the syllables (P3). Since the combined tallies for
P1 and P2 involve 59% of the total number of irregularities, some phonetic similarity 
between the deviate and preferred speech is the most common occurrence. 

It thus appears that in this group of subjects speaking under these conditions, a 
deviation in speaking is most apt to be a modification or repetition of what has been 
said, to be one syllable in length, to involve an indefinite referent, and to be phonet- 
ically similar to the story. 


DISCUSSION 


The propaedeutic nature of this investigation makes separation into the custom- 
ary divisions encountered in a scientific report difficult. Much of the discussion is an 
integral part of the methodology and has been presented with the development of 
the methodology. The problems of sampling are reduced in this investigation by its 
current interest in the specification and isolation of irregularities rather than in a 
determination of their distribution in the population at large. Although the present 
observational technique does not permit prediction to non-observational situations, 
it is advantageous for the present purpose in as much as it may be presumed to max- 
imize the data. The necessity of speaking into a recording machine and fulfilling an 
assignment in the presence of an instructor most probably created a stress situation 
and thereby increased the irregularities in speaking. 

The principles utilized for the specification of these deviations have no justifica- 
tion beyond their parsimonious character and face validity. They are the product of 
necessity in a discipline not sufficiently well developed at the present time to have a 
technique adequate for some of its subject matter, in this instance, the reliable ob- 
servation of covert responses. 

As previously noted the concept “slip of the tongue” appears to involve an in- 
creasing neglect of the audience or an increasing egocentricity. The present proced- 
ure permits the isolation of various stages in this egocentricity continuum but does 
not approach the problem of equality or inequality of the distance among the steps. 
In fact there is some indication that the steps themselves need additional refinement. 
There is a disruption in integrated, ongoing behavior in all of them. Type I is unique 
in its elaborative or repetitive component and Type IV by its omission of speech.
Type IV may in some cases involve merely a dropping of the voice to a subvocal level 
rather than an absolute failure to verbalize. Whatever its exact nature it reflects 
an irregularity in observable, ongoing behavior. Types II and III are similar in the 
sense of involving speech that is not appropriate within the context. Further investi- 
gation might well reveal that they are not psychologically discrete. The present 
differences in frequencies are the smallest between these two types. 

The present study identifies interruptions in speaking behavior and as such sug- 
gests emotional involvement at the time the deviations occurred. Their identification 
does not clarify their causes. The high frequency of brief interruptions, of words 
with referential non-specificity, and of phonetic similarity implies that the disturbing
agents will not be discerned from what was said. The present data point to inade- 
quacies in popular “dynamic” interpretations of “slips of the tongue.” To illustrate, 
one subject said, “Put his food on the radiator” in a context in which the word foot
was expected. It would be easy to infer from this that the subject suffered from fixa- 
tions, preoccupations, and what not of an oral nature. The current investigation sug- 
gests that such phonetic similarity characterizes most deviations. Scientific protocol 
demands the consideration of this factor of neuromuscular similarity before adopting 
less parsimonious constructs. As evidenced by the prevalence of such words as 
counter-, denial, and reaction in present clinical communications it is popular to 
infer a motivating condition as the opposite of its manifest appearance. The low fre- 
quency of opposites in the present data indicates that the explanatory opportunities 
of such an approach are very limited. Allied with this is the reduced frequency of 
human referents. If “slips of the tongue” are the product of internal ‘instinctual 
cravings’, drives or forces, one would expect that a speaker would more frequently 
become involved emotionally when using words referring to humans as contrasted 
with words with non-specific referents. The present findings reveal that only about 
one-fifth of the deviations occurred in speech referring specifically to humans. 

This reduction in explanatory opportunity does not eliminate an “unconscious 
determination” of interferences with ongoing behavior. There have recently been 
papers on anecdotal levels of evidence that ascribe such irregularities in speaking to 
resistance which in turn is translated into aggression. It is possible that the
various types of errors reflect increasing resistance. The present writer prefers to 
label these phenomena as an increase in egocentricity and not to identify them with 
internalized forces that specify the exact nature of the difficulty. There is no accept- 
able evidence at present to particularize the cause of the blocking as a specific type 
of internalized conflict. The deviations might well arise from the characteristics of 
the stimuli, from the immediately previous errors which in themselves make the 
subject less efficient and more prone to interruptions, or from developmentally 
acquired idiosyncrasies and habitual response patterns. At this point the author can
only call upon that scapegoat “additional research” for the answers. The complexity 
of linguistic interactions demands that an experimental analysis systematically in- 
vestigate the speaker, his reactional biography, and the stimulus field as they singly 
and in interaction contribute to the speaking process. 


SUMMARY 


This paper reports an attempt to distinguish irregularities in speaking from 
other responses pre-current or current with speaking and to distinguish among var- 
ious types of deviations in speaking. Analysis of recorded TAT stories indicated four 
different types of deviations which occurred with a decreasing frequency and had a 
face validity of representing an increase in egocentricity. The length of these inter- 
ruptions, their referential specificity, and their phonetic similarity to preferred 
speech were counted. The results revealed that a deviation in speaking is most apt to 
be a modification or repetition of what had been said, to be one syllable in length, to 
involve a word without a specific referent, and to be phonetically similar to the pre- 
ferred speech. 
APPENDIX A

Principles for Distinguishing Deviations in Speaking from Irregularities in Other
Responses Pre- or Concurrent with Speaking.

1. PRINCIPLE: Such sounds as “ah,” “er,” “hmm,” “the” were not considered in this
study. E.g., 1: “And ah he doesn’t seem to be able to ah to accomplish much.”

RATIONALE: Although these sounds represent attention to the audience they do not,
except by inference, involve a modification of words and therefore do not permit
differentiation between interruptions in speaking and hesitant thinking or perception.

2. PRINCIPLE: Words or phrases that were incidental explanations of the speaker’s
thinking or verbalizations of his difficulties with the task were not considered as
irregularities in speaking. E.g., 17BM: “It is common for pole vaulters to practice on
on the climbing rope because in their, part of their—oh what the devil. When they
thrust the pole into the ground...”

RATIONALE: Phrases such as “I meant,” “What the devil,” and “As I was saying”
represent attempts to inform the audience of the speaker’s thinking processes. There
is no evidence of an irregularity in speaking in these verbalizations although they
are frequently preceded and followed by speaking deviations.

3. PRINCIPLE: Awkward or inferior words were considered to be speaking deviations
only when they involved an obvious lapse from the subject’s knowledge. E.g., 3BM:
“Hisself again, himself again.”

RATIONALE: This is not an investigation of grammatical accuracy. Grammatical
concepts do not parallel psychological behavior (4).

4. PRINCIPLE: If the speech was smooth but illogical in the particular context in
which it was offered, it was considered outside the interest of this paper. E.g., 19:
“There’s some type of formation here in about the center right of the picture that
appears to be like a bear with a long tail with the nose of an ant eater—hmm.”

RATIONALE: There is no irregularity in speaking and the difficulty is therefore
presumed to be perceptual or thinking rather than verbal.

5. PRINCIPLE: If words or phrases were spoken in sequence and were separated by
connective words, the various units of the sequence were considered to be appropriate
verbalizations of the subject’s covert responses. E.g., 14: “That’s the shadow of a boy
or a man or it ake be a female.”

Exceptions to this principle were made in those instances in which the subject told
the audience that he was modifying his speech to accord with changes in his thinking
or perceiving but verbalized these changes by the use of disorganized or incompleted
phrases. E.g., 11: “The water, not water buffalo but the oh my animal, the mountain
goat.”

RATIONALE: The connectives are evidence of attention to the audience, i.e., the
subject is informing the audience that he has altered his perception or his story.

In these instances the speech may have initially reflected blocks only in perception
or thinking but the speaker failed to verbalize these completely and thereby produced
an irregularity in speaking.

6. PRINCIPLE: In deciding whether a unit of a series of possibly redundant words or
phrases that lacked connectives was a correction of what had been said (a datum)
rather than a further discourse or expansion, three criteria for inclusion as data
were specified:

(a) The addition of a word or phrase which reduced the certainty of the — E.g., 19:
“Climbing evidently climbing...”

(b) The use of a word or phrase that improved the grammar or logic. E.g., 9BM:
“Evidently traveling, migrant workers...”

(c) The use of opposed referents in the datum and the story. E.g., 12M: “Head is
under, is over a pillow...”

RATIONALE: When a subject reduces the definiteness of his comments, improves his
grammar or logic, or contradicts his own speech he has probably said something which
he did not care to let stand.


APPENDIX B

Principles by which speaking irregularities per se were classified.

1. PRINCIPLE: Contractions were counted as single words.

RATIONALE: An effort was made to minimize the deviations in order to reduce
experimenter-imposed evaluation.

2. PRINCIPLE: In identifying each speaking deviation, the smallest possible number
of words was included.

RATIONALE: #1

3. PRINCIPLE: If two or more irregularities were juxtaposed a single datum was
counted and classified according to the more serious type. E.g., 15: “Has some kind
of key or so in (II) which he could open up some of the coffins...”

RATIONALE: #1

4. PRINCIPLE: In cases of a single deviation in which there was doubt as to which
type should be scored, the irregularity was placed in the lower numbered type.

RATIONALE: Less egocentric irregularities are more probable than more egocentric
ones.

5. PRINCIPLE: An incomplete word or phrase was scored as Type I or II unless it was
essential to the comprehension of the story. In these instances a missing word,
Type IV, was scored. E.g., 10: “It’s (its?) well they have been sitting down...”
(This was scored II but could have been interpreted as a word missing after it’s and
scored at that point as IV.)

E.g., “This picture seems to depict a prehistoric: a era in which — seems to be two
animals.” (Since it would be impossible to delete “in which — seems to be two
animals” and maintain the coherence of the story, IV was scored.)

RATIONALE: This procedure specifies the less serious deviation.

6. PRINCIPLE: In a series of words or phrases repeated in a redundant manner, the
last was considered to be the preferred and the earlier one of the series counted as
a datum. E.g., 20: “The individual standing behind, beside the lamp light...”

The only exceptions occurred in those cases in which such a procedure would violate
the logic of the story or grammatical custom. E.g., 11: “He got out of it the wrong
way, the wrong direction...”

RATIONALE: A subject is more apt to make an error and correct it than he is to make
a correct response and “spoil” it.

7. PRINCIPLE: In order to differentiate between Type I and Type II irregularities
that were composed of unnecessary words or phrases with non-specific referents, a
“test” was applied. The story was read with the inclusion of the potential data and
the omission of both its assumed resumption and any intervening material. If the
reading left a product that was logical and relatively complete, the data were
considered to be resumed and therefore classified as Type I. If the product was
illogical, the data were considered to be of Type II. E.g., 9BM: “They’ve had to,
they’ve become strong.” (This can be read logically as “They’ve had to... become
strong” and is therefore considered to be Type I.) E.g., 2: “The man in the picture
has well is standing at the moment.” (Has cannot be included in this statement and
leave a logical product and is therefore considered as Type II.)

RATIONALE: Maintenance of logic indicated continuity of topic. The speaker has not
abandoned his topic but is altering his verbalizations about it.

¹The word data is used to refer to irregularities in speaking and the word story to
those parts of the TAT stories which are not interruptions in speaking. Data are
identified by underlining.
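Principles 3 and 4 of Appendix B amount to two tie-breaking rules over the four ordered types. They can be sketched as follows, on the assumption (drawn from the text's description of Types I through IV as increasingly egocentric and serious) that a higher numbered type is the more serious one. The function names are hypothetical.

```python
# Ordering of the four interruption types, from least to most serious.
SEVERITY = {"I": 1, "II": 2, "III": 3, "IV": 4}

def resolve_juxtaposed(types):
    """Principle 3: juxtaposed irregularities count as one datum of the more serious type."""
    return max(types, key=SEVERITY.__getitem__)

def resolve_doubt(candidates):
    """Principle 4: when the type of a single deviation is in doubt, use the lower numbered type."""
    return min(candidates, key=SEVERITY.__getitem__)

juxtaposed = resolve_juxtaposed(["I", "II"])   # two adjacent irregularities, scored once
doubtful = resolve_doubt(["III", "II"])        # one deviation that could be either type
```

The two rules pull in opposite directions by design: adjacent deviations are merged upward, while an ambiguous single deviation is scored conservatively.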





VOCATIONAL TESTS AS MEASURES OF PERFORMANCE OF
SCHIZOPHRENICS IN TWO REHABILITATION ACTIVITIES¹
BERNARD A. STOTSKY 


Veterans Administration Hospital, Brockton, Massachusetts 


PROBLEM 

The treatment of schizophrenia is moving, to an increasing extent, in the direc- 
tion of vocational rehabilitation. Special attention is being focused on chronic schizo- 
phrenics who have not responded well to other therapies. VA hospitals have ini- 
tiated extensive rehabilitation programs whose major goal is that of returning 
chronic psychotics to their communities and homes as productive workers even 
though residuals of their mental illnesses may remain. With the increasing employ- 
ment of rehabilitation methods, the psychologist in the mental hospital setting is 
confronted with new problems such as the proper selection of patients for these pro- 
grams and the investigation of those factors which may be significantly related to 
success in rehabilitation activities. 

Little work has been done by psychologists in this area. No studies have been 
published in psychological journals dealing with the use of aptitude and interest 
tests for predicting success of mental patients in rehabilitation activities. Although 
unexplored, the problem of adequate selection of patients for these activities is of 
major concern not only to the rehabilitation therapists but to all personnel con- 
cerned with the treatment of the schizophrenic patient. Proceeding on the assump- 
tion that measures of aptitude and of interest could be of assistance in dealing with 
this problem, the author sought to develop measures for differentiating patients 
who did well in rehabilitation activities from patients who did poorly and to deter- 
mine the relative importance of aptitude and of interest as predictors of the perform- 
ance of hospitalized schizophrenic patients in two rehabilitation activities. 


MATERIALS 

Setting. This study was performed at a new 940 bed VA neuropsychiatric hos- 
pital specializing in the rehabilitation of chronic mental patients, mostly schizo- 
phrenics. The Physical Medicine and Rehabilitation Service (PMRS) plays the key 
role in rehabilitation since it includes the following activities: Manual Arts (MAT), 
Educational (ET), Occupational (OT), Corrective (CT), Physical (PT), and In- 
dustrial (IT) therapies. Patients are assigned to these therapies by the psychiatric 
team upon the recommendation of the counseling psychologist and the Chief of 
PMRS. MAT and ET were selected for study here for two reasons: (a) Vocational 
tests are used for selecting patients for individual, intensive instruction in MAT and 
ET activities. (b) Patients assigned to these two activities are regarded by the psy- 
chiatric team as having the best prognoses for rehabilitation and are prepared first 
for full day work assignments in regular jobs at the hospital and then for gainful 
employment in the community. MAT consists of the following subsections with a 
different therapist for each subsection: woodworking, machine and auto mechanics, 
printing and graphic arts, drafting, radio and electricity. ET consists of one section 
for individual instruction with one therapist providing instruction in all kinds of 
clerical work, operation of office machines, and class preparation in elementary and 
high school subjects. 


Selection of tests. Originally we intended to use only standard commercial tests 
in the study. A trial run on 48 patients from MAT and ET of such interest tests as 
the Kuder Preference Record, California Occupational Interest Inventory, and 
Brainard Occupational Preference Inventory yielded no significant differences be- 
tween good and poor performing patients for any of the interest areas sampled. With 
respect to aptitude, the Bennett Mechanical Comprehension Test, Form AA, and 
the O’Rourke Mechanical Aptitude Test did not differentiate 14 patients rated high
in work performance in MAT from 16 rated low. The Minnesota Clerical Test differ-
entiated at the .05 level between eight patients rated high and ten rated low in ET.
In view of these findings commercial tests were used in measuring aptitude for ET
activities, while for MAT a new measure of aptitude was developed. For both activi-
ties a new measure of interest was developed which is described below.

¹From VA Hospital, Brockton, Massachusetts. Acknowledgment is made to A. Brophy, J. Leiberman,
D. Lambert, R. Ericson, R. Williamson, P. Blackjohn, P. Ballantine, and M. Novak for their
assistance.

Interest test. Items for the interest test were selected so that they would refer to 
specific activities performed by patients at the hospital; for example, “Make beds
and clean floors”, “add figures on an adding machine”, “solder wires together”, and
“shape wood with lathe, drill, band saw, chisel, and file”. Social service items were
eliminated as well as items not relevant to patient activities. Altogether there were
115 items, 25 dealing with activities related to ET, 57 with activities related to MAT,
and the remaining 33 referring to other types of work. The distribution of items was
randomized. All questions were to be answered by circling an L for like, I for in-
different, and D for dislike. Each L was weighted +1; each I, 0; and each D, -1.
The test was administered to patients in both MAT and ET in groups ranging from 
two to 12. 

Twenty patients took the test twice at an interval of two weeks. For them the 
average agreement on interest choices was 88.2 percent. Test-retest reliability for 
57 MAT items was .85, for 25 ET items, .91, both significant at the .001 level. The 
test was considered sufficiently reliable for use in this study. The test record was re- 
garded as satisfactorily completed when a patient answered at least 92 items and 
used more than one alternative for his answers. Where only one alternative, such as 
L, was circled for all items, the test record was discarded. 
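The weighting and the two completion rules described above can be expressed compactly. This is a minimal sketch under stated assumptions: the middle (indifferent) alternative is keyed here as `"I"`, unanswered items are represented as `None`, and the function name `score_interest` is hypothetical. The article does not say whether scores were summed over all items or separately over the MAT and ET items; the same sum could be applied to either subset.

```python
# Weights as stated in the text: like = +1, indifferent = 0, dislike = -1.
WEIGHTS = {"L": 1, "I": 0, "D": -1}
MIN_ANSWERED = 92  # a record needed at least 92 answered items

def score_interest(responses):
    """Return the weighted sum for a record, or None if the record would be discarded.

    responses: list of "L", "I", "D", or None for an unanswered item.
    """
    answered = [r for r in responses if r is not None]
    too_few = len(answered) < MIN_ANSWERED
    single_alternative = len(set(answered)) < 2  # e.g. L circled for every item
    if too_few or single_alternative:
        return None
    return sum(WEIGHTS[r] for r in answered)

# Hypothetical record: 60 likes, 20 dislikes, 20 indifferent, 15 unanswered (115 items).
record = ["L"] * 60 + ["D"] * 20 + ["I"] * 20 + [None] * 15
record_score = score_interest(record)
```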

MAT aptitude test. Since MAT emphasizes primarily mechanical, motor, and 
spatial visualization skills involved in working with tools, assembling and disassem-
bling machine parts, reading blueprints, making measurements, etc., MAT thera-
pists were asked to submit items pertaining to skills regarded by them as important
for success in their activities, work assignments, and projects. By unanimous agree- 
ment among three judges, 68 items were selected from 146 samples submitted by the 
therapists. The items referred to the five main areas of MAT: woodworking, auto 
mechanics, machine work, electrical work, and printing. Drafting items were elim- 
inated because it was felt that presently available commercial tests, such as the Re- 
vised Minnesota Paper Formboard and the Ruch and Case Survey of Space Rela- 
tions were entirely adequate for measuring drafting ability. A sample item from each 
of the five areas is given below: 

Woodworking: “Which of these tools wouldn’t you use in making a picture frame?”
(a) saw (b) mitre box (c) plane (d) hack saw

Machine work: “Which of these tools would not be used in working with sheet metal?”
(a) plane (b) hammer (c) riveting punch (d) staking tool

Auto mechanics: “What would you expect if the temperature gauge read over 200, and
the ampere gauge read ‘minus’?” (a) broken fan belt (b) low oil level (c) broken
generator lead (d) no gas in tank

Electricity: “Which one of these is not a basic part of a radio?” (a) antenna
(b) tubes (c) preamplifier (d) volume control

Printing: “Which one of these is not used in printing?” (a) linotype (b) lathe
(c) offset (d) press


Forty-six questions were of the type described above. Eight required reading 
diagrams and recognizing electrical symbols. Six questions required recognition of 
carpenter’s and mechanic’s tools, and eight required correct proofreading. A time 
limit of 40 minutes was set based on the maximum time required by six normals to 
complete the test. Split-half reliability for 43 subjects was .87. 

ET tests. In ET the emphasis in instructing patients is on the motor skills in- 
volved in operating office machines, the verbal skills required for certain clerical 
tasks, and quantitative skills required in keeping books and accounts in a business 
straight. Five educational therapists, after reviewing a number of commercial tests
measuring these skills, selected the Revised Beta Examination and three subtests of 
the Detroit Retail Selling Inventory (Opposites, Checking, and Arithmetic) as most 
suitable for administration to their patients. Both tests have high reliabilities which 
are reported in their manuals. It should be noted that these tests were not
chosen because they were better able to measure ET skills than other tests, but be- 
cause it was felt that they could most easily be used with chronic, disturbed, and re- 
gressed patients for measuring the skills. 


DESIGN 

Subjects. There were two samples: a preliminary sample (Sample I), consisting 
of 48 schizophrenics assigned to MAT and 38 to ET, and a validating sample (Sam-
ple II) collected six to eight months later, consisting of 50 schizophrenics assigned to
MAT and 14 to ET. The subjects ranged in age from 21 to 66, in education from 
fourth grade to one year of graduate work. All but four patients were white. There 
were no cases of definitely established cerebral pathology. The average length of 
hospitalization was 34 months, the range from three months to 26 years, with most 
patients hospitalized continuously more than a year. All patients were regarded 
clinically as in good contact though four were found to be negativistic and one mute 
on testing. Over 90 percent had been considered deluded at some time during their 
illnesses. The majority were diagnosed as either undifferentiated or paranoid schizo- 
phrenics, the remainder as hebephrenic, catatonic, or simple. The range of aptitude 
and work performance sampled in this study was somewhat narrower than that of 
the total hospital population since only nonregressed patients were assigned to MAT 
and ET activities by the psychiatric teams. 


Evaluation of performance in ET and MAT. Before any of the test data were
collected, each therapist was asked to rate the patients assigned to his activity. A
23-item scale containing items referring to work habits and skills, work attitudes,
and interpersonal relations on the job was filled out for each patient by his therapist.
A sample of each type of item is given below:

“Does he stick to a job until it is done?” “Can he use tools properly?”
“Is he interested in learning new things?” “Does he get along well with
other patients?”


For each item one of three alternatives was circled by the rater: U for usually, 
S for sometimes, or R for rarely. Each therapist rated only his own patients. For 
every item, the alternative rated by three judges as indicating best work performance
was given a score of “3”, the next best a score of “2”, and the poorest a score of
“1”. The total score for a patient was the sum of the individual scores for the 23
items. To obtain a measure of the consistency of the ratings, each rater was asked 
one month after the original ratings to rank his patients in terms of overall work 
performance. Rho’s were calculated between these rankings and rankings for total 
scores on the work performance scale. They are listed as follows: Drafting (N = 8) 
.89, printing (N = 7) .89, machine work and auto mechanics (N = 12) .80, woodworking
(N = 21) .77, electricity (N = 12)* .84, and ET (N = 38) .87. All rho’s
were significant at the .001 level, indicating that therapist ratings were consistent 
from one time to the other. 
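
The scoring and the consistency check described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' worked procedure: the function names, the per-item keying strings, and the example values are hypothetical, and the rho formula is the classical form that assumes no tied ranks.

```python
def score_scale(responses, keying):
    """Total score on the rating scale.

    responses: one of 'U', 'S', 'R' per item, as circled by the rater.
    keying: per-item string listing the alternatives from best to poorest,
            e.g. 'USR' when "usually" indicates the best work performance
            and 'RSU' when "rarely" does (hypothetical keying).
    The best alternative scores 3, the next best 2, the poorest 1.
    """
    total = 0
    for answer, key in zip(responses, keying):
        total += 3 - key.index(answer)
    return total


def ranks_from_scores(scores):
    """Rank 1 goes to the highest score; assumes no tied scores."""
    ordered = sorted(scores, reverse=True)
    return [ordered.index(s) + 1 for s in scores]


def spearman_rho(rank_x, rank_y):
    """rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)) for two sets of untied ranks."""
    n = len(rank_x)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Illustrative values only: four patients rated by one therapist.
scale_ranks = ranks_from_scores([60, 55, 50, 45])
later_ranks = [1, 3, 2, 4]  # overall ranking made one month later
rho = spearman_rho(scale_ranks, later_ranks)
```

With real data, tied total scores would call for averaged ranks, which this sketch does not handle.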

The various subsections of the scale, the seven-item work habits and skills, the 
eight-item work attitudes, and the eight-item interpersonal relations on the job sub- 
scales were intercorrelated. Product-moment r’s ranging from .72 to .79 were found, 
indicating a high degree of interrelation among subscales. The r’s between each of 
the subscales and the total scale ranged from .79 to .91. There seemed to be no rea- 
sonable basis for considering subscales separately. Therefore, only the total score 
for the entire scale was used to evaluate work performance in MAT and ET. 

*This group of patients was subsequently dropped from the study when they became unavailable 
either through discharge or dispersal of the patients as a result of the extended absence of the therapist 

Procedure. 1. In both samples, patients in MAT and ET were divided into two
groups: a high and a low. The high group consisted of patients scoring above the 
median for their particular activity on the work performance scale, the low group of 
patients scoring below the median of the scale. Patients in each activity were con- 
sidered separately. 

2. The highs and lows for each MAT activity were combined into one high and 
one low group respectively including all MAT patients. For ET this was not neces- 
sary since there was only one large group. 

3. Aptitude and interest test data were then collected by testing patients in 
groups ranging in size from two to twelve. The interest test and the specially de- 
vised MAT aptitude test were administered to MAT patients while ET patients 
took the interest test, Revised Beta, and the three subtests of the Detroit Retail 
Selling Inventory. Eight MAT subjects refused to take the aptitude test; sixteen 
could not be tested for other reasons, such as illness, pass off the grounds, trial visit, 
somatic therapy, etc., leaving a total of 74 who took the test, 40 highs and 34 lows. 
None of the ET patients refused either the Beta or the Detroit, though two patients 
failed to take the former test and three the latter for reasons beyond control. The 
N for the Beta was 25 highs and 25 lows, for the Detroit, 24 highs and 25 lows. The 
interest test was used only in Sample I, with seven patients refusing to take it, 18 
unable to take it for some reason beyond their control, and 13 tests having to be dis- 
carded either because less than 92 items were completed or because of failure to 
select more than one alternative. To avoid any possibly contaminating influence of 
therapist ratings on the selection of subjects, work performance scales were placed in 
sealed envelopes before any of the subjects were seen or chosen for the study. Actual 
selection of patients for testing was made from attendance rosters. An effort to test 
every patient selected was made either on the ward or at the activity. Patients in 
high and low groups in both samples were equivalent with respect to age, education, 
occupation, diagnosis, and length of hospitalization. 
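
The grouping in steps 1 and 2 can be sketched as a median split within each activity followed by pooling. The names and the example scores are hypothetical. The text does not say how patients scoring exactly at the median were handled; this sketch leaves them out of both groups, which is an assumption.

```python
from statistics import median


def split_high_low(scores):
    """scores: dict of patient id -> work performance total for ONE activity."""
    med = median(scores.values())
    highs = sorted(p for p, s in scores.items() if s > med)
    lows = sorted(p for p, s in scores.items() if s < med)
    return highs, lows


def pool_groups(per_activity):
    """Combine the highs and lows of every activity into one high and one low group."""
    all_highs, all_lows = [], []
    for scores in per_activity.values():
        highs, lows = split_high_low(scores)
        all_highs += highs
        all_lows += lows
    return all_highs, all_lows


# Illustrative values only.
example = {
    "drafting": {"a": 60, "b": 50, "c": 40, "d": 30},
    "printing": {"e": 65, "f": 45},
}
high_group, low_group = pool_groups(example)
```

Splitting within each activity before pooling keeps a lenient or severe rater from placing all of his own patients in one group.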

Information concerning the validity of the work performance scale was obtained 
by determining the extent to which a high rating on the scale was related to success- 
ful outcome of treatment as rated three months later. If a patient moved into a 
regular part-time or full-time work assignment at the hospital or was discharged to 
gainful employment in the community, the outcome was rated positive. If he remained
at the same assignment, regressed to a lower level assignment, or was discharged
but not gainfully employed in the community, the outcome was rated negative.
From Table 1, it can readily be seen that high scores on the scale were significantly
related to positive outcome. Although there is the possibility that therapist
judgments, communicated to the psychiatric team by means of progress notes
and personal discussions, could have influenced the disposition of these patients, the 
likelihood is that the MAT and ET therapists did not play a major role in these de- 
cisions since they are not members of the team. Moreover, team decisions are based 
on many factors besides patient performance in rehabilitation activities. 


TABLE 1. PREDICTION OF OUTCOME OF REHABILITATION TREATMENT BY THE
WORK PERFORMANCE SCALE

Groups                        Outcome        % Correct

SAMPLE I
  Patients above median
  Patients below median

SAMPLE II
  Patients above median
  Patients below median


RESULTS

Highs vs. lows on aptitude. On the MAT aptitude test, highs obtained signi- 
ficantly greater scores than lows for both samples. In ET, the differences for test 
scores on the Beta and the Detroit were also significant in both samples, with highs 
in both instances obtaining higher scores. For aptitude tests, therefore, there were 
clear differences between the groups. 
TABLE 2. COMPARISON OF HIGHS AND LOWS ON APTITUDE TESTS.

                    MAT                         ET
Groups         Aptitude Test         Revised Beta        Detroit Retail Selling*
               Mean    t     p       Mean    t     p     Mean

SAMPLE I
  High¹        43.8                                      134.8
                                                  .02
  Low²         26.1                  76.8                 73.5

SAMPLE II
  High³        47.9                                      156.0
                       3.9   .001                 .001
  Low⁴         30.5                                       93.2

*Scores on Detroit consisted of sum of raw scores for the three subtests.

¹N’s for highs: 23 for MAT Apt. Test, 19 for Beta, 18 for Detroit
²N’s for lows:
³N’s for highs: 17 for MAT Apt. Test
⁴N’s for lows:

Highs vs. lows on interest. The two groups in Sample I were compared for interest 
test scores on items referring to MAT and ET activities. None of the comparisons 
yielded significant differences between highs and lows. To explore the possibility 
that the lack of significance might be due to inability to make discriminations for 
such a large number of items, 37 patients were also given a simplified preference test 
in which they were asked to rank 12 rehabilitation activities, three of which referred 
to ET, three to MAT, and six to other work and recreational activities. For these 
patients, the rho between ranks on interest test scores and average ranks for MAT 
activities on the 12-item test was .61. The rho between interest test scores for ET 
and average ranks assigned to ET activities was .64. The fairly high correlation be- 
tween scores on the two interest tests, while not completely ruling out faulty dis- 
crimination as a reason for no differences on interest items, strongly suggests that 
this was not a main reason. Further comparison of interest test scores of 24 MAT 
patients with scores for 24 ET patients, who were equivalent with respect to educa- 
tion, age, diagnosis, and length of hospitalization, showed, using t tests, significant 
differences for the 25 ET items (t = 2.5, p of .02) and for the 57 MAT items (t = 2.3,
p of .03) with ET’s showing significantly greater preference for ET activities and
MAT’s for MAT activities. The significant differences for interest scores between
groups in ET and MAT, as well as the lack of differentiation between highs and lows 
within the activities imply that the failure of the interest items may be due to a lack 
of measurable difference between highs and lows for interest variables. The interest 
test was not used in Sample II. 
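
The comparison of two groups of 24 patients by t test can be illustrated with the classical pooled-variance form of the statistic. The text does not state which form was computed, so the pooled form is an assumption here, and the function name and example scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Student's t for two independent groups, using a pooled variance estimate.

    Returns the t value and its degrees of freedom (n_a + n_b - 2).
    """
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    std_error = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / std_error, na + nb - 2


# Illustrative values only: interest scores on one set of items for two groups.
t_value, df = pooled_t([2, 4, 6, 8, 10], [1, 2, 3, 4, 5])
```

The resulting t is then referred to a t table with the stated degrees of freedom to obtain p.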


Predicting outcome of treatment. Predicting the outcome of any type of treat- 
ment in a mental hospital is a complex task. Many factors, such as other forms of 
treatment, attitudes and competence of professional staff, attitudes of administra- 
tive personnel and hospital aides, the family’s attitude and behavior, hospital ad- 
ministration and policy, economic conditions in the patient’s community, other en- 
vironmental stresses, and finally changes in rehabilitation programs through staff 
turnover, assignment of new patients, policy changes, etc., influence outcome.
Nevertheless, a test, to be clinically and diagnostically useful, must be able to
forecast future behaviors and outcomes effectively.

An attempt was made to determine the extent to which the aptitude test could 
predict outcome of treatment. The same criteria for evaluating outcome which were 
utilized for the work performance scale were also employed for the aptitude test. 


TABLE 3. PREDICTION OF OUTCOME OF REHABILITATION TREATMENT BY APTITUDE TESTS

Columns: Outcome, % correct, and p, each given for Sample I, Sample II, and Total.

Test               Scores      Outcome     % correct     p

MAT Apt. Test*     > 37        32          80            .001
                   < 37        8

Rev. Beta*         > 80        28
                   < 80
                   > Med.
                   < Med.

Det. Ret. Sell.*   > 90
                   < 90        4   0
                   > Med.      13  5       62            .24
                   < Med.      9   10

*Median scores in Sample I: MAT Apt. Test 37
                            Rev. Beta 93
                            Det. Ret. Sell. 108


The MAT aptitude test predicted significantly better than chance in both instances, 
with the degree of accuracy being sufficiently high to make the test useful for pur- 
poses of selecting patients for MAT. For the Beta and Detroit two comparisons were 
made in Sample I, one at the median and the other at the point of maximum differ- 
entiation of highs and lows. Both cutting points for the Beta continued to differ- 
entiate highs and lows significantly in Sample II. On the Detroit, a score of 90 pro- 
vided the best differentiation in Sample I, while the median score did not discrim- 
inate beyond chance. However, in Sample II, differences for the former did not reach 
significance, while for the median differences reached the .01 level. Scores for both 
groups combined yielded significant discriminations for all five cutting points. 
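
The logic of a cutting point, fixed in one sample and then tried on a second, can be sketched as below. The names and data are hypothetical, and "per cent correct" is taken here to mean the share of patients whose outcome matches the prediction "positive if above the cutting score", which is an assumption about how the tabled percentages were formed.

```python
def percent_correct(scores, outcomes, cut):
    """Predict a positive outcome when score > cut; return % of correct predictions.

    outcomes: True for a positive outcome, False for a negative one.
    """
    hits = sum((s > cut) == o for s, o in zip(scores, outcomes))
    return 100 * hits / len(scores)


def best_cut(scores, outcomes):
    """Try every observed score as a cutting point; keep the most accurate one."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: percent_correct(scores, outcomes, c))


# Illustrative values only: a first sample used to choose the cutting point ...
first_scores = [20, 30, 35, 40, 45, 50]
first_outcomes = [False, False, True, True, True, True]
cut = best_cut(first_scores, first_outcomes)

# ... and a second sample on which the same cutting point is then checked.
second_scores = [25, 32, 38, 48]
second_outcomes = [False, True, True, False]
holdout_accuracy = percent_correct(second_scores, second_outcomes, cut)
```

A cutting point chosen for maximum differentiation in one sample will usually look better there than in a fresh sample, which is why the check on Sample II matters.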


DISCUSSION


Aptitude tests were far more successful than interest tests in differentiating the 
more highly rated patients from the less highly rated ones in MAT and ET. While 
interest in a particular area is essential for participation, the lack of significant re- 
lationship between interest choices and success in these activities supports the em- 
ployment of suitable aptitude tests in screening the patient population for potential 
candidates for intensive instruction in MAT and ET. Whereas, in the past at this 
hospital, selection of patients for MAT and ET was on the basis of expressed interest 
in one or the other area, with aptitude testing being performed only after the patient 
manifested difficulty in adjusting to the demands of his work, the findings indicate 
that aptitude testing can be of great value as an aid in determining the assignment. 
The aptitude tests were also able to predict the outcome of treatment satisfactorily. 
That they can also be utilized for prognostic purposes in planning treatment is encouraging.

Patient performance on the aptitude tests and in the more advanced rehabilita- 
tion activities, such as MAT and ET, seems to be, within certain limits, a reasonably 
reliable indicator of future adjustment in more responsible positions either in the 
hospital setting or in the community outside. The limits are defined by the nature of 
the environmental situation facing the patient extramurally. With a supportive 
home environment in which he does not have to support himself by working, a 
patient doing poorly may manage to maintain himself in the community while a 
patient who is doing better, confronted with severe family problems and other 
stresses, may give way. Possibly the best prediction would be obtained through the 
inclusion not only of aptitude and performance measures, but also an assessment of 
the family situation, social and economic factors confronting the patient when he 
leaves the hospital. 

From the point of view of personality theory, the success of the work perform- 
ance scale and of aptitude tests in discriminating between groups and predicting 
outcome serves to demonstrate the value of approaching the problem of assessment 
and prediction at the level of aptitude and performance, the so-called ego level. In 
attempting to forecast an outcome defined in terms of successful job performance, 
the use of instruments related to the phenomena being predicted has proven success- 
ful. 

It should be noted that this is only a first attempt to apply vocational tests in 
a systematic manner to rehabilitation activities. Although preliminary results are 
encouraging, further refinement of measures is indicated. The findings strongly 
support the original assumption that vocational tests and scales could be developed 
or adapted for use with neuropsychiatric patients which would predict at a level 
high enough to be useful to the psychiatric team in planning rehabilitation programs 
for patients. 


SUMMARY 


1. Vocational tests of aptitude and interest, developed or adapted for use with 
a predominantly chronic group of schizophrenic patients, were studied in terms of 
their ability to differentiate patients rated high in performance in two rehabilitation 
activities from patients rated low and also in terms of their ability to predict the 
outcome of rehabilitation treatment. An attempt was also made to evaluate the 
relative importance of aptitude and interest for success in these activities. Two 
samples, the first consisting of 86 schizophrenics, and the second of 64 schizophrenics, 
drawn from the Manual Arts and Educational Therapy subsections of the Physical 
Medicine and Rehabilitation Service, were studied. 


2. The three aptitude tests which were employed discriminated between high and 
low patients at significant levels of confidence for both samples. No significant differ- 
ences were obtained on the interest test although patients in one subsection differed 
significantly from patients in the other on interest scores. 


3. The aptitude tests were able in all but two instances to predict outcome of treat- 
ment significantly better than chance. Aptitude seems to be more important than 
interest for successful performance in this type of rehabilitation activity. 


4. Vocational tests adapted for use with neuropsychiatric patients can be of assist- 
ance in planning rehabilitation treatment and in predicting its outcome. 


A STUDY OF THE RELATIONSHIPS BETWEEN SELF-RATINGS AND
PARENT-RATINGS FOR A GROUP OF COLLEGE STUDENTS
LOUIS G. PORTER AND CHALMERS L. STACEY 


Syracuse University 


PROBLEM 

One of the basic assumptions of Freud’s theory of personality is that the child 
identifies with parental value-systems. Until recently there has been much theoreti- 
cal discussion, but little systematic experimental attention paid to the mechanism of 
identification. Reiter reported that identification is not related to either a positive
or a negative affective relationship with the parent as measured by the recall of outstanding
childhood experiences. Sopchak found evidence to support the hypothesis
that parental identification by the child is associated with normality and good ad- 
justment, as measured by the Minnesota Multiphasic Personality Inventory. Tend- 
encies toward abnormality, in general, are associated with failure to identify with the 
parents. This generalization is true for both male and female subjects in so far as 
identification with the father is concerned. In a similar methodological approach to 
the problem of identification, Beier and Ratzeburg reported that male college
students identify more readily with the father, and female students more readily with 
the mother. They reported, further, that when males identify strongly with the 
father they are likely to ascribe more than average femininity values to the mother, 
as measured by the M-F scale of the MMPI. 

The present study was designed to investigate the degree of relationship which 
exists between an individual’s self-rating and his rating of a parent of his choice on a 
number of personality traits. Specifically, the problem was to determine if an indi- 
vidual’s identification with his parents is selective in terms of particular personality 
traits. For the purposes of the present study, identification is defined as the tendency 
to rate both self and parent in a similar manner. Implicit in this definition is the 
assumption that when an individual is free to select a parent whom he is going to rate 
he will select the parent he knows best. It is inferred that the chosen parent is likely 
to be the one with whom he has most closely identified. 

For purposes of the present study, it is further assumed that both item common- 
ality and correlation between trait scores are fruitful methods of ascertaining the 
degree of child-parent identification. It must be remembered, however, that it is 
unlikely that an individual can really understand the complex personality dynamics 
of his (or her) parent. Therefore, the relationships to be studied are considered to be 
those existing between the individual’s concept of himself and his concept of his 
parents. 


METHOD


Subjects. Two hundred fifteen students in general introductory psychology and 
mental hygiene classes at Syracuse University rated themselves and their preferred 
parent on the Guilford-Zimmerman Temperament Survey. Of this total, 121 were 
females and 94 were males. Their ages ranged from 17 to 23 years. 


Administration and Scoring. The two administrations of the test (self-rating and 
parent-rating) were given by the experimenters during regular class periods. The 
subjects were assigned numbers and told to sit in alternating seats to insure some 
degree of anonymity. In order to instill an incentive for answering honestly, the re- 
search nature of the study, the freedom of choice in participating, and the avail- 
ability of personalized reviews were discussed with the subjects. In order to equalize 
transfer effect, four of the eight classes studied were instructed to rate themselves 
first, while the remaining four classes were asked to rate their parents first. The two 
administrations were separated by an interval of one week. At the time of the first 
administration, subjects were unaware that later they would be requested to parti- 
cipate in the second administration. 

The Guilford-Zimmerman answer sheet is marked with a Yes, No, or question
mark response. The subjects were told not to use the question mark response, but to 
limit themselves to the use of Yes or No. Standard procedures were carried out for 
the self-ratings. When rating parents, the following revised directions were given: 
“Rate one of your parents, e.g., the parent you feel most competent to rate, on each
item according to your observations of his (her) behavior. If you have difficulty in 
answering a particular item, try to make a judgment based upon knowledge of other 
areas of the parent’s behavior.” 

The analysis of the ratings of self and of parent consisted of two parts. The first 
part was concerned with commonality, which is the percentage of item agreement 
between the two ratings. That is, the percentage of items on which the subject gave 
the parent the same rating (Yes or No) as he gave himself. The second phase of an- 
alysis consisted of computing Pearson product moment correlations between the 
respective trait scores obtained from the two administrations. In analysis of the test 
results, subjects were divided into four groups: (1) males who rated their fathers 
(M/M); (2) males who rated their mothers (M/F); (3) females who rated their 
mothers (F/F); and (4) females who rated their fathers (F/M). Various combina- 
tions of these four groups were used in analyzing the data. 
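
The two parts of the analysis can be sketched as follows. The function names and the example ratings are hypothetical; the commonality function follows the definition given above (percentage of items given the same Yes or No rating), and the correlation is the ordinary Pearson product-moment coefficient.

```python
from math import sqrt


def commonality(self_items, parent_items):
    """Percentage of items on which self and parent received the same rating."""
    same = sum(s == p for s, p in zip(self_items, parent_items))
    return 100 * same / len(self_items)


def pearson_r(x, y):
    """Pearson product-moment correlation between two lists of trait scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Illustrative values only: four items answered for self and for parent.
agreement = commonality(["Yes", "No", "Yes", "Yes"], ["Yes", "Yes", "Yes", "No"])
```

Commonality works item by item within one subject, whereas the correlation is computed across subjects on the trait scores, so the two measures can disagree for the same trait.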


RESULTS 


Commonality between items. Table 1 presents results relating to the commonality be- 
tween the subjects’ self-ratings and their ratings of preferred parents for the ten 
different traits. Since the elimination of the question mark permits only four combin- 
ations of responses to a particular item (i.e., Yes-Yes, No-No, Yes-No, No-Yes), by 
chance one would expect to obtain agreement (commonality) on 50 per cent of the 
items. The results shown in Table 1 indicate that for the nine groups the separate 
trait and total (for all 300 items) commonalities were all significantly different from 
50 per cent (p was equal to or less than .01), with the exception of trait M in the case 
of subjects who rated their parents of the opposite sex. With this one exception, the 
subjects showed a positive relationship between the way in which they see them- 
selves and the manner in which they describe their parents. 
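
A comparison of an observed commonality with the 50 per cent chance level can be illustrated with a normal approximation to the binomial. The text does not say which test was used or what was taken as the number of observations, so both the method and the example figures below are assumptions.

```python
from math import erfc, sqrt


def z_against_chance(agreements, n_items, chance=0.5):
    """z for an observed proportion of agreements against a chance proportion."""
    observed = agreements / n_items
    std_error = sqrt(chance * (1 - chance) / n_items)
    return (observed - chance) / std_error


def two_sided_p(z):
    """Two-sided probability of a standard normal deviate at least as large as |z|."""
    return erfc(abs(z) / sqrt(2))


# Illustrative values only: 198 agreements on 300 items (66 per cent).
z = z_against_chance(198, 300)
p = two_sided_p(z)
```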


TABLE 1. COMMONALITY (PER CENT OF ITEM AGREEMENT) BETWEEN SUBJECTS’ SELF-RATINGS AND
THEIR RATINGS OF PARENTS OF THEIR CHOICE.

Columns: M/M, M/F, F/F, F/M, Opposite Sex¹, Same Sex², Males³, Females⁴, Total Trait Commonality.
Rows: Number of Subjects (M/F 43, Males 94, Total 215); Traits G, R, A, S, E, O, F, T, P, M (%); Total Item Commonality.

*Not significantly different from 50% at the .01 level.

¹Opposite sex means the combination of groups M/F and F/M
²Same sex means the combination of groups M/M and F/F
³Males means the combination of groups M/M and M/F
⁴Females means the combination of groups F/F and F/M

Trait commonalities for all nine groupings were not significantly different from
their respective total commonalities, with the exception of trait M for the subjects 
who rated their opposite-sex parent. For these subjects, although commonalities for 
trait M were not significantly different from 50 per cent, they were significantly 
different from the total commonalities. The total of the trait commonalities for all 
the groups averaged 66 per cent agreement between the two ratings. Thus, the re- 
sults indicate a significant degree of identification of the subjects with their pre- 
ferred parents for all traits. This conclusion is inferred from a significantly greater 
than chance agreement between the way subjects rated themselves and the way they
rated their parents. 


TABLE 2. RESULTS OF THE CR TEST BETWEEN TRAIT COMMONALITIES (SEE TABLE 1) FOR DIFFERENT
GROUPINGS. INCLUDED ARE ONLY THOSE TRAITS ON WHICH SIGNIFICANT DIFFERENCES WERE FOUND.

Columns: M/M vs. M/F, M/M vs. F/F, M/M vs. F/M, M/F vs. F/F, M/F vs. F/M, F/F vs. F/M,
Opposite vs. Same Sex, Males vs. Females.
Rows: Traits G, R, O, F, T, M; Total Item Commonality.
Entries: F, <.01; M, <.01; Total Item Commonality, <.01 and <.01.


Table 2 presents the results of the CR test between trait commonalities for the 
different groups (see also Table 1). Traits A (ascendance), S (sociability), E (emotional
stability), and P (personal relations) were omitted from the table because no
significant differences between groups were found for these traits. Of the remaining 
six traits, differences between groups on trait M could have been predicted on the 
basis of normal sex difference. For trait G (general activity), the results indicate that 
females who rated their mothers (F/F) showed significantly greater agreement than
did the M/M and F/M groups. Subjects who rated their same-sex parents also 
showed a greater commonality than did those who rated their opposite-sex parent. 
This difference can be attributed in large part to the F/F group. On trait R (restraint),
males who rated their fathers (M/M) showed a significantly lower commonality
than did the F/M group.

Females who rated their mothers (F/F) showed significantly higher common- 
alities on traits O (objectivity) and T (thoughtfulness) than did males who rated 
their fathers (M/M). When the subjects were combined into groups based upon the 
sex of the subjects, males showed a greater number of common responses with their 
parents on trait O than did females. The reverse was true for trait T. On the remain- 
ing trait F (friendliness), males who rated their mothers showed greater agreement 
than did males who rated their fathers. These results suggest that, as far as item 
agreement or commonality is concerned, identification varies from trait to trait ac- 
cording to the sex of the individual and the parent rated. 
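
A critical ratio (CR) between the commonalities of two groups can be illustrated as the difference between two proportions divided by its standard error. The exact formula used in the study is not given, so the pooled-proportion form below, the function name, and the example figures are all assumptions.

```python
from math import sqrt


def critical_ratio_proportions(p1, n1, p2, n2):
    """CR for the difference between two independent proportions.

    p1, p2: proportions of agreement (e.g. .70 for 70 per cent).
    n1, n2: numbers of observations on which each proportion is based.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error


# Illustrative values only.
cr = critical_ratio_proportions(0.70, 100, 0.60, 100)
```

A CR of about 2.58 or more corresponds to the .01 level on a two-sided normal test.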


Correlations between trait scores. Table 3 presents correlations for each trait between 
the subject’s self-rating and his rating of his preferred parent for the various groups 
of subjects. From Table 3, it can be seen that the correlations ranged from —.48 to 
.77. For the different groups, trait P consistently yielded the highest correlations, 
while trait R consistently yielded the lowest correlations. In addition, trait R did not 
show a significant relationship between self-rating and parent-rating for any of the 
groups. Since this (R) was the only trait which did not yield a significant correlation, 
it may be inferred that it is the only parental trait with which the subjects do not 
measurably identify. 

TABLE 3. CORRELATIONS BETWEEN SUBJECTS’ TRAIT SCORES AND TRAIT SCORES OF PARENTS RATED
BY THE SUBJECT.

Columns: M/M, M/F, F/F, F/M, Opposite Sex, Same Sex, Males, Females, Total.
Rows: Number of Subjects (F/M 56, Total 215); Traits G, R, A, S, E, O, F, T, P, M.

*Significant with p equal to or less than .01


Traits which yielded significant correlations for all groups were O, F, and P. 
From this it may be interpreted that the subjects tend to identify with their parents 
most frequently and to the greatest degree on these traits. Trait M showed the fol- 
lowing relationships: (1) the highest correlation was between females and their 
mothers (.47); (2) no correlation between males and their mothers; and (3) a sig- 
nificant negative correlation for the combined groups who rated their opposite-sex 
parent. Although not strictly comparable, these results for trait M tend to confirm 
Beier and Ratzeburg’s findings that subjects tend to identify more closely with 
parents of the same sex as themselves. The high correlation on trait M for females 
who rated their fathers suggests that the ‘“‘masculine” female may be inclined to 
identify more strongly with her father than is true of males. 

Significant correlations for trait T were found only in the case of subjects who 
rated parents of their own sex. This result suggests that subjects tend to identify in 
thoughtfulness only with their same-sex parent. A similar tendency was found on 
traits G, A, and E. In contrast to this, significant correlations were found for trait 
S in the case of three groups: (1) M/F, (2) opposite-sex, and (3) males. This finding 
would seem to indicate that only males who prefer to rate their mothers show a sig- 
nificant degree of identification on this trait. 

Although many of the trait correlations were significantly different from zero, 
only a few significant differences between the various groups of subjects were found. 
Results of the CR test of differences between the trait correlations for the different 
combinations are summarized in Table 4. 
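
Testing the difference between two correlations after transformation to z scores can be sketched with Fisher's transformation. The function names and example figures are hypothetical, and the standard error shown is the usual one for two independent samples, which is assumed rather than taken from the text.

```python
from math import atanh, sqrt


def fisher_z(r):
    """Fisher's z transformation of a correlation coefficient."""
    return atanh(r)


def critical_ratio_correlations(r1, n1, r2, n2):
    """CR for the difference between correlations from two independent groups."""
    std_error = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / std_error


# Illustrative values only.
cr_r = critical_ratio_correlations(0.50, 103, 0.20, 103)
```

The transformation is needed because the sampling distribution of r is skewed when the population correlation is far from zero, while that of z is close to normal.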


TABLE 4. RESULTS OF THE CR TEST BETWEEN TRAIT CORRELATIONS (TRANSFORMED TO Z SCORES)
FOR DIFFERENT GROUPINGS. INCLUDED ARE ONLY THOSE TRAITS FOR WHICH SIGNIFICANT DIFFERENCES
WERE FOUND.

Columns: M/M vs. M/F, M/M vs. F/F, M/M vs. F/M, M/F vs. F/F, M/F vs. F/M, F/F vs. F/M,
Opposite vs. Same Sex, Males vs. Females.
Entries: <.01; <.05.





The significant difference on trait E between the same-sex group and the op- 
posite-sex group indicates that subjects who rated their opposite-sex parents do not 
identify with the parent on emotional stability. Table 5 shows that the mean score 
on E for males who rated their mothers was lower than the published norms. Males 
who preferred to rate their mothers appear to be slightly more emotionally unstable
than is true of the other groups. This result is consistent with Sopchak’s findings. 

Groups who rated parents of the same sex as themselves showed a significantly 
greater degree of identification on traits T and M than did those who rated their op- 
posite-sex parents. However, group F/M yielded a significantly greater degree of re- 
lationship than did group M/F. This finding is in sharp contrast to the results of the 
item analysis and may be explained by the scoring procedure. A significant differ- 
ence between groups on trait P indicated a greater degree of identification with 
parents in the case of females. Females identify most strongly with their fathers, and 
do so to a significantly greater extent than is true of males. 


TABLE 5. MEAN TRAIT SCORES FOR SUBJECTS’ SELF-RATINGS AND RATINGS OF PARENTS TABULATED
FOR VARIOUS GROUPINGS.

Traits    M/M  M/F  F/F  F/M  Opposite Sex  Same Sex  Males  Females  Total

G   S*    16 17 18 17 17 17 16 18 17
    P*    20 19 21 22 20 20 20 21 20
R   S     17 18 18 18 ? 18 18 18 18
    P     21 22 21 21 21 21 21 21
A   S     19 18 18 17 18 18 17 18
    P     21 17 20 ? 20 21 20 21 20
S   S     21 ? 22 22 22 22 22 22
    P     22 ? 23 23 23 22 23 23
E   S     18 ? 18 18 18 18 18
    P     21 18 19 19 20 19 19
O   S     19 ? 20 20 20 19 20 20
    P     21 19 19 20 20 20 20
F   S     16 5 17 16 16 15 17 16
    P     13 5 15 14 14 14 14 14
T   S     21 19 20 19 20 19 19
    P     17 17 16 ? 18 16 17 17 17
P   S     18 18 19 19 19 18 19 19
    P     17 18 17 18 17 17 17 17
M   S     21 20 12 ? 16 21 12 16
    P     23 10 13 23 17 17 18 17 17

*S means self-rating
*P means subject’s rating of parent


Comparison with published norms. Mean trait scores for the subjects’ self-ratings and 
parent-ratings are presented in Table 5. A comparison of the subjects’ mean trait 
scores with normative data published by Guilford and Zimmerman reveals interest- 
ing differences. The present subjects rated their parents as possessing traits G, R, A, 
S, E, and M to a greater degree than was true of themselves. On the other hand, 
they rated themselves as possessing traits F, T, and P to a greater extent than their 
parents. In the case of trait R, for example, it is observed that the mean trait score 
for the subject’s self-rating is 18, while the mean parent-rating is 21. It is also noted 
that the mean score for this trait reported in the Guilford-Zimmerman manual is 
16.4 for both sexes. This suggests that the subjects judge their parents to be more 
restrained than themselves. The same interpretation would seem to apply for traits 
G, A, S, and E.

The results compiled in Table 5 raise the question of whether there are real 
personality differences between subjects who preferred to rate their same-sex parent 
and subjects who preferred to rate their opposite-sex parent. There is some indication
that males who rated their mothers tended to be somewhat more emotionally
unstable and more impulsive than males who rated their fathers. This finding is in 
line with that reported by Sopchak. 


SUMMARY 


This has been an exploratory study to investigate the degree of relationship 
which exists between the way an individual rates himself and the way he rates one of 
his parents on a paper and pencil test of personality. A sample of 94 male and 121 
female college students were asked to rate themselves and a parent of their own preference
on the Guilford-Zimmerman Temperament Survey. The use of this instrument
permitted a comparison of ten different trait scores to discover whether identification
of subject with parent is selective in terms of the different traits. The data
were evaluated in terms of item commonality and correlation between trait scores. 

The results of the item analysis support the hypothesis that identification with 
the parent is selective according to particular traits. The total amount of item agree- 
ment or commonality is approximately two-thirds. Identification varies from trait 
to trait and depends upon the sex of the subject and the sex of the parent rated. The 
specific findings were: 

1. Males identified more closely with their fathers on masculinity, and with 
their mothers on friendliness. 

2. Females identified more closely with their mothers on general activity,
objectivity, and thoughtfulness than did the males with their fathers.

3. Males identified more closely with their fathers on masculinity than did 
females, while females identified more closely with their fathers on restraint. 

4. Females identified more closely with their mothers on general activity and 
femininity than with their fathers. 

5. Subjects who rated their same-sex parents identified more closely with that 
parent on general activity and masculinity-femininity than did those who rated 
their opposite-sex parent. 

6. Males identified more closely with their parents on objectivity, while 
females identified more closely on thoughtfulness. 


The results of the correlations between trait scores also support the hypothesis 
that identification with the parent is selective according to particular traits. The 
significant findings were: 


1. Subjects who rated their same-sex parent identified more closely on emo- 
tional stability, thoughtfulness, and masculinity-femininity than did those who rated 
their opposite-sex parent. 

2. Females identified more closely with their parents on personal relations
than did the males, and this was especially true for females who rated their fathers.


3. ‘Masculine’ females identified more strongly with their fathers than did
males on the masculinity-femininity scale.


AN EXPERIMENTAL STUDY OF THE EFFECTS OF INDIVIDUAL AND
GROUP PRESENTATION OF THE RORSCHACH PLATES¹


J. H. ROHRER AND BARBARA W. EDMONSON 


Urban Life Research Institute, Tulane University 


PROBLEM 


This study was designed to test three hypotheses related to problems of Ror- 
schach testing. An earlier study of a technique for presentation of standard 
Rorschach cards to groups suggested two of the hypotheses: (a) The protocols ob- 
tained by a group administration of the Rorschach plates produce essentially the 
same frequency distribution in the various scoring categories as would be obtained 
by individual administration of the cards, and (b) Group administration of the 
Rorschach plates results in a protocol that more validly reflects an individual’s 
psychological dynamics since distortions due to examiner-examinee interactions are 
minimized under conditions of group administration. In addition to these two hypotheses
this study was designed to permit testing of a third hypothesis: (c) Successive
presentations of the Rorschach cards produce significant changes in types
of responses given by superior adult subjects.


METHOD AND PROCEDURE 


Subjects. The subjects (Ss) in this experiment were male undergraduate students 
attending Tulane University and enrolled in the Air, Army or Navy Reserve Officers 
Training Corps. All Ss volunteered for the study for which they were paid an hourly 
rate. The ages of the Ss ranged from 18 years 7 months, to 22 years 3 months, with 
a mean age of 19 years 10 months. Subjects were chosen for a particular testing 
session because of their availability at a specific time, as determined from their class 
schedules. 

A 2x2x2 experimental design was used thus permitting all combinations of the 
three variables manipulated; i.e., group or individual presentation, order (group
first or second), and examiner (A or B). Each S was randomly assigned to one of the
eight possible experimental conditions. Six Ss were tested under each condition,
making a total of 48 Ss. At the time of his initial interview, each S was told that the
experiment in which he was participating would require three sessions, two devoted 
to psychological testing and a third to an experiment in visual performance (an 
autokinetic situation). He was also informed that after one session of psychological 
testing he would be interviewed briefly about “his attitudes, background and so
on.” This interview, which was conducted by a psychiatrist, always occurred after
the individual Rorschach test. 


Individual Rorschach Administration. Administration and inquiry of the individual
Rorschach followed the procedure discussed by Klopfer and Kelley. Because
of the nature of the experiment, no “testing of the limits” was done. Two
women examiners administered the test. Twenty-four of the Ss were seen by Examiner A,
24 by Examiner B.


Group Rorschach Administration. The technique for presenting the Rorschach
plates to the group has been described elsewhere. Briefly, the plates were projected
in sequence, in a darkened room, onto a glass-beaded screen. The Ss wrote
Rs in booklets. Individual inquiries were carried out with each S at the conclusion of 
the group session. The Ss were tested in groups ranging in size from 7 to 11 persons. 
The mean time interval between the two testing sessions was 9.5 days. 


¹This study is one of a series carried out under ONR contract Nonr-475(01) with Tulane University.
While this paper constitutes a technical report to ONR, the interpretations presented herein
are those of the authors and do not represent, necessarily, those of the sponsoring agency, the Department
of the Navy.


With those Ss who had the group administration first, the instructions were the
same as used previously. When the group session followed the individual one, the
instructions were replaced with a statement that these were the same blots that the 
Ss had seen in the individual session, but in this session the blots were to be projected 
on a screen and they were to write their answers in booklets rather than verbalize 
them. It was emphasized that they were to make no effort to remember what they 
had seen before or to deliberately look for new perceptions. Similar instructions were 
given to Ss whose second test was an individual one. 


Scoring. The protocols were scored primarily according to Klopfer’s method.
Beck’s scoring statistics were used where applicable. The scoring of D, Dd,
F+, and P strictly followed Beck. A detailed discussion of the scoring technique
used is to be found elsewhere.


RESULTS 


The experimental design used was a factorial one aimed at permitting the statis- 
tical evaluation of the effects of manipulating three variables. These variables were: 
(a) Order of presentation (group first or second); (b) examiner (A or B); (c) type of 
card presentation (individual or group). In preparation for applying variance analysis
to the data, Bartlett’s test for variance homogeneity was performed on data from
four categories previously shown to be distributed normally (thereby satisfying
one assumption necessary for the proper use of the analysis of variance technique).
The categories chosen were R, W, D, and Dd. M was also included because of
its importance to Rorschach interpretation, although it was known that its distribu- 
tion is generally J-shaped. As shown in Table 1 only the variances for D and M did 
not exhibit heterogeneity. Square root and logarithmic transformations of the R data 
obtained from the group presentation were made and Bartlett’s test was repeated. 
As shown in Table 1, variance heterogeneity was still in evidence. Thus, we were 
forced to conclude that the pooled data by type for each of the two order groups were 
not drawn from a common population and hence the plan for use of variance analysis 
to evaluate the observed differences had to be abandoned. 
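
The homogeneity check described above can be illustrated with a short sketch. It computes Bartlett's chi-square statistic by hand, assuming the standard textbook form of the test; the function name and the sample scores are hypothetical and are not the study's data.

```python
import math
from statistics import variance


def bartlett_chi_square(groups):
    """Return (chi_square, degrees_of_freedom) for Bartlett's test.

    `groups` is a list of lists of scores, one list per experimental cell.
    Standard textbook formula; assumed, not taken from the paper.
    """
    k = len(groups)
    dfs = [len(g) - 1 for g in groups]          # degrees of freedom per cell
    variances = [variance(g) for g in groups]   # unbiased sample variances
    total_df = sum(dfs)
    # Pooled variance, weighted by each cell's degrees of freedom.
    pooled = sum(df * v for df, v in zip(dfs, variances)) / total_df
    numerator = total_df * math.log(pooled) - sum(
        df * math.log(v) for df, v in zip(dfs, variances)
    )
    # Bartlett's small-sample correction factor.
    correction = 1 + (sum(1 / df for df in dfs) - 1 / total_df) / (3 * (k - 1))
    return numerator / correction, k - 1


# Hypothetical response counts for three cells (illustration only).
cells = [[22, 30, 27, 35], [18, 60, 41, 90], [25, 26, 28, 24]]
statistic, df = bartlett_chi_square(cells)
print(f"chi-square = {statistic:.2f} on {df} df")
```

A statistic that exceeds the tabled chi-square value for k - 1 degrees of freedom indicates heterogeneous variances, which is the condition that led the authors to abandon the analysis of variance.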


TABLE 1. CHI SQUARE VALUES FOR BARTLETT’S TEST OF
SELECTED RORSCHACH SCORING CATEGORIES

Type of                     Scoring Category
Presentation        W          D          Dd         M

Group             19.26**    13.83      34.4       8.66
Individual        12.92       6.87      19.09**    5.35

*Significant at 5% level of confidence
**Significant at 1% level of confidence


Order of card presentation permits the identification of two groups, A and B. 
Visual inspection of the data, presented in Table 2, revealed that while groups A and 
B, for a given administration, showed marked differences in the means of their res- 
pective frequency distributions, there appeared to be a rather systematic change in 
mean frequencies from first to second administration. However, before we could 
evaluate the statistical significance of the observed shifts, it was necessary to de- 
monstrate that the order of type of presentation was not a factor influencing the 
observed shifts. In order to test this hypothesis concerning order of type of presentation,
the raw data in each scoring category for each S were converted to percentages
and, in turn, angle transformations of the percentages were made. Next,
for each group, the difference in each scoring category between first and second 
administrations was determined, taking into account the sign of the change. Thus, 


TABLE 2. MEANS OF THE RAW DATA FREQUENCIES OBTAINED IN EACH
RORSCHACH SCORING CATEGORY














Scoring         Group A (N = 24)       |   Group B (N = 24)
Category   | Ind. 1st | Group 2nd | Group 1st | Ind. 2nd
| 30.% | 36.62 23 .92 30.38 
7.50 9.2: .58 
10. 
3 . 54 
0.4: 

there resulted an array of transformed percentage scores for groups A and B. Student’s
t test was applied to each of the category differences between the A and B
groups. None of the obtained values approached significance (the most significant 
difference had a p < .50). Thus, we were assured that the observed differences be- 
tween groups A and B were not due to the type of presentation of the Rorschach 
cards. 
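
The order check just described can be sketched as follows. This is a minimal illustration that assumes the "angle transformation" is the usual arcsine square-root transformation and that the t test is the pooled-variance two-sample form; the function names and the example percentages are hypothetical.

```python
import math
from statistics import mean, variance


def angle_transform(percent):
    """Arcsine square-root transformation of a percentage, in degrees (assumed form)."""
    return math.degrees(math.asin(math.sqrt(percent / 100.0)))


def signed_shift(first_pct, second_pct):
    """Change from first to second administration, keeping the sign."""
    return angle_transform(second_pct) - angle_transform(first_pct)


def student_t(a, b):
    """Pooled-variance Student's t for two independent samples."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


# Hypothetical W% values (first, second administration) for a few Ss in each order group.
group_a = [signed_shift(f, s) for f, s in [(40, 32), (35, 30), (28, 29), (50, 41)]]
group_b = [signed_shift(f, s) for f, s in [(38, 33), (30, 27), (45, 40), (33, 34)]]
print(f"t = {student_t(group_a, group_b):.2f}")
```

A t value near zero for every category would correspond to the authors' conclusion that the two order groups shifted in the same way.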

We were then free to test for the effects of the second administration. For evaluating
this effect, Wilcoxon’s test for paired replicates was applied to the difference
in percentages in each category for the combined A and B groups, taking into
account the direction of the sign change. Table 3 presents those scoring categories in 
which statistically significant shifts occurred and the direction of the shift. It will 
also be noted in Table 3 that number of Rs, when tested by the Wilcoxon test, also 
increased significantly on the second administration. By combining the A and B 


TABLE 3. SIGNIFICANT CHANGES IN FREQUENCIES OF SCORING CATEGORIES 








Scoring Significance level at Significance level at 
Category which category which category 
increased decreased 














F% Beck 


F% Klopfer 


R 











groups one can evaluate, through the use of Wilcoxon’s test, shifts in percentages 
due to type of card presentation. In order to make this test it is necessary to reverse 
the signs of the shifts observed in one group before ranking the individual percent- 
ages. An evaluation of the percentages in each of the scoring categories using this 
technique revealed two significant differences. A greater percentage of anatomy 
responses was produced by the individual administration (p < .001), as was a greater
percentage of responses to cards VIII, IX, X (p < .01).
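
The sign-reversal step can be made concrete with a small sketch. It assumes the usual signed-rank procedure (zero differences dropped, tied absolute values given average ranks); the function names and the example shifts are hypothetical and do not reproduce the study's data.

```python
def signed_rank_sums(differences):
    """Return (sum of positive ranks, sum of negative ranks) for paired differences."""
    ordered = sorted((d for d in differences if d != 0), key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Find the run of tied absolute values starting at position i.
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for idx in range(i, j):
            ranks[idx] = average_rank
        i = j
    positive = sum(r for d, r in zip(ordered, ranks) if d > 0)
    negative = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return positive, negative


def presentation_effect(shifts_group_a, shifts_group_b):
    """Express every shift as individual minus group administration.

    Group A took the individual test first, so its second-minus-first shifts
    are reversed in sign; Group B took the group test first, so its shifts
    already run in the individual-minus-group direction.
    """
    return [-d for d in shifts_group_a] + list(shifts_group_b)


# Hypothetical second-minus-first shifts in anatomy percentage.
effect = presentation_effect([-3.0, -1.5, 0.5], [2.0, 4.0, -1.0])
print(signed_rank_sums(effect))
```

The smaller of the two rank sums is the quantity referred to a table of critical values for the number of nonzero pairs.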

Because of the limitations imposed by the fact of variance heterogeneity which 
prevented the use of analysis of variance, and because of a confounding of the ex- 
aminer-examinee variable in our non-parametric analysis, it was not possible to 
evaluate statistically the effects of examiner differences. 


DISCUSSION


In the main, the results support our first hypothesis to the effect that the same 
frequency distributions are obtained by a group presentation of the Rorschach 
plates as obtained from an individual presentation. Two exceptions to this generalization
were noted: a greater number of anatomy Rs, and on the last three cards a
greater percentage of Rs. Both of these changes took place under conditions of individual
administration. The increase in anatomy Rs probably is explained by the
fact that two attractive young women administered the test to college males. The
increase in R percentages on the last three cards is difficult to explain. Since these
cards are all color cards, one would expect some significant change in ΣC, or in individual
color frequencies, brought about by type of administration. However, such
changes did not occur. One can speculate on the possibility of a shifting pattern of
color response frequencies as a function of type of administration, but if such shifts
take place, they were too subtle to be teased out from our data. Hertzman, in
using an “equivalent groups” comparison of the individual and group method of
Rorschach card presentation, also found a statistically significantly higher percentage
of Rs on the last three cards under conditions of individual administration.
Coupled with our findings, it appears that the increase cannot be attributed to a rare
statistical occurrence, but is due to type of presentation. Taking into account these
two exceptions, this finding indicates that the norms reported in our previous study,
based on Ns of 1,000 and 374 respectively, can be used safely as aids in the interpretation
of Rorschach protocols obtained through individual administration.

It should be made explicit that two important kinds of data are lost through
the group administration; i.e., reaction times and subjective clinical impressions formed
by the administrator during the course of individual test administration. However,
if one is using the test results for strictly scientific purposes, only the first of
these losses is important and such a loss may be offset by the decrease in response
distortions due to the less intimate interpersonal situation holding under conditions
of group administration. While we were unable to make a statistical evaluation of
the effects of examiner differences upon the R protocols (a test related to our second
hypothesis), the demonstration by Lord and the Air Force that such shifts do
occur reinforces our interpretation of the increased percentage of anatomy responses
as being a resultant of more intimate interpersonal interactions taking place
under conditions of individual administration.

Our third hypothesis was also supported by our findings, in that the second 
administration produced significant shifts in frequencies of 12 scoring categories. 
Essentially, the shifts noted were ones which, on second administration, reflect a 
relatively more meticulous, controlled and intellectualized exploration of the stim- 
ulus materials (increased D, Dd, Hd, Ad, At, F%, R, with a decrease in W%, and 
P%; increased S accompanied by decreased c and FC). 

Previous findings on the effects of a second administration of the Rorschach 
cards show the following agreement with our findings. Harrower-Erickson and 
Steiner report an increase in D, Dd, and R and also a “change” (sign not indicated)
in F%. Lord reported a decrease in W% but no change in R.

Lindner reports that under conditions of group administration, there tends
to be an under-emphasis on color responses and a greater productivity of M responses.
Our results show a trend towards greater productivity in the M category
under conditions of group administration, but the shift was not significant statistically.
We found no such trend as mentioned by Lindner for ΣC or for the M:ΣC ratio.


SUMMARY 


This study was designed to test three hypotheses: (a) Protocols obtained by a 
group administration of the Rorschach plates produce essentially the same fre- 
quency distribution in the various scoring categories as would be obtained by in- 
dividual administration of the cards; (b) Group administration of the Rorschach 
plates results in a protocol that more validly reflects an individual’s psychological 
dynamics since distortions due to examiner-examinee interactions are minimized 
under conditions of group administration; (c) Successive presentations of the Ror- 
schach cards produce significantly different changes in types of responses given by 
superior adult Ss. 

Analysis of the results revealed only two significant changes in frequencies in
scoring categories when the group and individual protocols were compared. The 
individual administration resulted in a greater percentage of anatomy responses and 
a greater response production on cards VIII, IX, X. 

Our second hypothesis could not be tested due to a failure of the data to satisfy 
assumptions underlying the application of parametric statistics, and this variable 
was confounded when appropriate non-parametric statistics were applied. 

Our third hypothesis was supported in that 12 significant shifts took place. The 
shifts reflected a relatively more meticulous, controlled, and intellectualized explora- 
tion of the stimulus materials on second administration. 


SPIRAL AFTEREFFECT AS A TEST OF ORGANIC BRAIN DAMAGE
ARTHUR J. GALLESE, JR.¹
State Hospital, St. Peter, Minnesota 


PROBLEM 


Inability to perceive a negative aftereffect following rotation of an Archimedes 
spiral was significantly related to diagnosis of organic brain damage according to 
Price and Deabler who found that their group of adult normal males and nonorganic 
psychiatric patients were “. . . able to perceive the aftereffect to a marked degree”
(1, p. 300), while marked impairment of this ability seemed to be demonstrated by
their group of mixed organic cases with known cortical involvement. The only other
reported clinical use of this technique was by Freeman and Josey, who demonstrated
evidence of correlation between the visual phenomenon and memory impairment.
Standlee seemed to obtain contrary results, but an apparently satisfactory
criticism of Standlee’s work was given by Price and Deabler.

In view of the paucity of clinical literature pertaining to this phenomenon, it 
was thought advisable to attempt an independent validation of the usefulness of this 
technique for detecting organic brain damage. This study also hopes to provide reliability
data, information on the performance of various subgroups of brain-damaged
patients, and an investigation of the relationship between test scores and age, sex,
and length of hospitalization.


¹The author wishes to express his appreciation to Wendell Swenson, clinical psychologist at the
St. Peter State Hospital, for conducting the retest examinations and for offering valuable assistance in
writing this paper.


METHOD


Two six inch disks painted with black Archimedes spirals of 244 turns on a 
white field were used. Spiral A evolved in a clockwise fashion from the center, while 
Spiral B evolved counterclockwise. A specially mounted phonograph turntable 
produced the necessary rotation of about 90 revolutions per minute. Each S was 
seated about 8 ft. from the apparatus. No attempt was made to use the same test- 
ing room nor to keep the illumination constant, for within broad limits these fact- 
ors appeared not to affect performance. Each S was asked, however, to describe 
the spiral before the test to assure the examiner of S’s ability to see the apparatus. 
Preceding the test S was told, “This is a special eye test. Look at the center, (E
points) and don’t take your eyes away until I tell you to.” Spiral A, giving normal
S’s a negative aftereffect of expansion, was always presented first for 30 seconds
of rotation. After 10 seconds of motion S was asked, “What does it appear to be doing?”
If S indicated that the spiral seemed to be contracting or going away from him, no
further questions were asked until the end of the rotation. If the answer did not indicate
that S was experiencing the usual illusion, E asked, “Does it seem to be changing
in any way?” If the answer was again unsatisfactory, E asked, “Does it seem to
be getting bigger or smaller, or going away from you or coming toward you, or anything
else?” At the end of 30 seconds the spiral was stopped and E asked, “Now
what does it appear to be doing?” Any answer which might have been construed to
indicate sensitivity to the normal phenomenon was scored 1; otherwise, the other two
questions were repeated in order. If the normal aftereffect was indicated, S received
a score of 1 for the trial regardless of the stage of inquiry at which it was reported;
otherwise he received a score of 0 for the trial. Next, spiral B was presented, then
A again and B again. The same inquiry was given with each trial. The total possible
score for a given S for the four trials was 4, 3, 2, 1, or 0.

Examples of scorable responses are as follows: “coming toward me,” “getting
larger,” “retracting,” “radiating out,” “it’s a snake crawling out of a box,” “reducing
from the angle of the vertex,” “fading away,” “spreading.” In general, any response
which would indicate that a negative aftereffect of expansion or contraction was
experienced with the appropriate spiral was scored. This procedure departs considerably
from Price and Deabler’s, but it was thought advisable in order to reduce the
likelihood that reticence or difficulties in strictly verbal report would contribute to
low scores. The author tested all S’s except during the reliability study.


Subjects. All S’s for this study were obtained from the employee and patient populations
of the St. Peter State Hospital, a state mental institution with approximately
2500 patients that accepts all types of mental disorder. The S’s were classified as follows:

1. Thirty normal adults employed at the hospital were used. No attempt 
to rule out those who might have had a history of organic brain damage was 
made. The median age was 38 years with a range from 19 to 67. 

2. Of 54 schizophrenics selected at random from diagnosis cards, 41 were 
testable. The others were either uncooperative, had visual defects, presented a 
word salad or were otherwise unsuitable. Most diagnoses were made on ad- 
mission to the hospital. There were 26 diagnosed paranoid schizophrenia; the 
others consisted of simple, catatonic and hebephrenic schizophrenias. No 
patients were undergoing electroconvulsive treatment. The median age was 43 
years with a range from 20 to 61. The median length of continuous hospitaliza- 
tion since last admission was 33 months with a range from 1 to 263 months. 

3. There were 166 patients in the hospital who were under 65 years of 
age and who bore a diagnosis of acute or chronic brain syndrome with whatever 
qualification. Of these, 97 were testable. The median age was 51 years with a 
range from 22 to 64. The median length of hospitalization since last admission
was 65 months with a range from 2 to 344 months. 

4. Of 21 hospitalized, lobotomized schizophrenics, 12 were testable.
They ranged in age from 27 to 56 years, and had been continuously hospitalized
from 67 to 319 months. The lobotomies had been performed from two to six
years previous to this study.


RESULTS 


A pilot study involving 20 organics and 10 normals indicated that dichotomous 
scoring into test “organic” and test “normal” groups would be most satisfactory for
discriminating purposes. It was decided to cut between scores of 2 and 3, such that 
scores of 2 and lower would be called organic, and scores of 3 and higher would be 
called normal. It was also observed that those organics diagnosed as acute and 
chronic brain syndromes associated with alcohol intoxication (N =20) and with 
idiopathic convulsive disorder (N =30) tended to perform very much more like the 
normals than like the other organics. It was then decided to split the organics into 
two groups: group A, which consisted of all diagnoses other than the alcohol and 
convulsive disorders, and group B, which consisted of the alcohol and convulsive 
diagnoses. In group A were 24 CNS syphilitics, 7 with CNS pathology associated 
with circulatory disturbance, 7 with CNS pathology associated with diseases of un- 
known or unspecified cause, 5 with epidemic encephalitis, and 4 with senile and 
presenile brain disease. 
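
The scoring rule from the Method section and the cutting score chosen here can be summarized in a brief sketch. The function names are hypothetical; only the rule itself (four trials scored 0 or 1, totals of 2 or lower called organic, 3 or higher called normal) comes from the text.

```python
def total_score(trial_scores):
    """Sum the four trial scores (spiral A, B, A, B), each scored 0 or 1."""
    if len(trial_scores) != 4 or any(s not in (0, 1) for s in trial_scores):
        raise ValueError("expected four trial scores, each 0 or 1")
    return sum(trial_scores)


def classify(total):
    """Apply the cut between 2 and 3: 2 or lower is 'organic', 3 or higher is 'normal'."""
    return "organic" if total <= 2 else "normal"


# Example: aftereffect reported on trials 1, 2, and 4 but not on trial 3.
print(classify(total_score([1, 1, 0, 1])))
```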

The Chi square technique was used to determine the significance of the relation- 
ship between the test scores and the following factors: age, sex, and length of hospital- 
ization since last admission. No relationship was found for any of these variables 
(p greater than .10 in all instances). 

Chi square was again used as a test of the differences between the groups. The 
following comparisons were made: 


Normals vs. schizophrenics, p greater than .30. 

Normals and schizophrenics vs. all organics, p less than .0005. 
Normals and schizophrenics vs. organic group A, p less than .0005. 
Normals and schizophrenics vs. organic group B, p less than .0005. 
Organic group A vs. organic group B, p less than .0005. 


None of the lobotomized schizophrenics were classified organic by the test. Using
the proposed cutting scores, a considerable percentage of the organics are identified
with very little misclassification of nonorganics. See Tables 1 and 2.
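
As an illustration of how such a comparison can be computed, the sketch below applies the ordinary fourfold chi-square formula to the nonorganic and organic group A counts. Two things are assumptions: the paper does not say whether a continuity correction was used (none is applied here), and the number of group A patients scoring normal is taken as 47 minus the 31 scoring organic.

```python
def chi_square_2x2(a, b, c, d):
    """Chi square for a fourfold table [[a, b], [c, d]], without continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# (scored organic, scored normal) for each criterion group.
nonorganics = (0 + 2, 30 + 39)   # normals plus schizophrenics
group_a = (31, 47 - 31)          # normal count derived from N = 47 (assumption)

chi2 = chi_square_2x2(*nonorganics, *group_a)

# Percentages of the kind reported in the tables and summary.
identified_a = round(100 * 31 / 47)                 # organic group A identified
identified_all = round(100 * (31 + 14) / (47 + 50)) # all organics identified
misclassified = round(100 * 2 / (30 + 41))          # nonorganics called organic

print(f"chi-square = {chi2:.1f}")
print(identified_a, identified_all, misclassified)
```

With 1 degree of freedom, a chi square of about 12.1 corresponds to p = .0005, so a value of this size is in line with the reported probability level.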


TABLE 1. SPIRAL AFTER EFFECT SCORES

                                           Test Score
Subject Group                         Organic       Normal

Normals (N = 30)                      0 (0%)        30 (100%)
Schizophrenics (N = 41)               2 (5%)        39 (95%)
Organic Group A (N = 47)              31 (66%)
Organic Group B (N = 50)              14 (28%)
Lobotomized Schizophrenics (N = 12)   0 (0%)


TABLE 2. CLASSIFICATION BY SPIRAL SCORE

                                     Test Classification
Subject Group                     Correct       Incorrect

Nonorganics
  (Normals and Schizophrenics)
Organic Group A                   66%
Organic Group B                   28%
All Organics                      46%








TABLE 3. CHANGE OF SCORES FROM FIRST TO SECOND TESTING
WITH DIFFERENT EXAMINERS

                        Second Testing Score
First Testing Score     Organic     Normal     Total

Normal                     0          15         15
Organic                   16           3         19

Total                     16          18         34





Reliability. A second psychologist who had hitherto been unfamiliar with the tech- 
nique of administration or scoring was asked to retest a number of persons randomly 
selected from the organic groups. A short explanation of the manner of procedure 
and the principles of scoring was given. This examiner was not aware of the original 
scores of the patients. In all, 34 organic patients were retested at intervals of from 
two to three months following the original testing. Results are presented in Table 3. 
Using Chi square to determine the significance of change from the first testing to 
the second testing, p greater than .20 was obtained. The fourfold point r between 
first and second testing was .84; thus indicating good reliability in scores obtained 
in a test-retest situation using different examiners.
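
The reported coefficient can be checked from the fourfold counts in Table 3. The sketch below assumes the usual phi formula for the fourfold point r; the cell for patients scoring organic on the first testing and normal on the second is taken as 3, the only value consistent with the row total of 19 and the column total of 18.

```python
import math


def fourfold_point_r(a, b, c, d):
    """Phi coefficient for a fourfold table [[a, b], [c, d]]."""
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))


# Rows: first testing (normal, organic); columns: second testing (normal, organic).
normal_normal, normal_organic = 15, 0
organic_normal, organic_organic = 3, 16   # 3 is inferred from the marginal totals

r = fourfold_point_r(normal_normal, normal_organic, organic_normal, organic_organic)
print(round(r, 2))
```

Rounded to two places, this reproduces the coefficient of .84 given in the text.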


DISCUSSION 


This study agrees with Price and Deabler that the spiral test is of value in differ- 
entiating between patients with organic brain damage and those without. However, 
it would appear that not all varieties of organicity are equally differentiated. It is 
observed that the lobotomized patients and those with damage associated with 
alcohol and convulsive disorder did not perform as did the other organics. In fact, 
lobotomized patients were not differentiated from the normals and schizophrenics 
to any degree. Hence, cortical involvement alone is apparently not sufficient for 
detection by this test. To be sure, the differences between Price and Deabler on one 
hand, and this study on the other, in methods of selection of criterion groups, statisti- 
cal analysis, and procedure during the test proper tend to lessen the comparability 
of the results. The emphasis in this study was on the lessening of the probability of 
obtaining low or organic scores because of difficulties other than perceptual. Were 
the detailed, direct inquiry not performed, many more schizophrenics would have 
obtained organic scores. However, with this method of inquiry and of scoring, the 
test almost always indicates organicity when organic scores are obtained, although 
the converse is not true. It is to be emphasized that the decision to use dichotomous 
scoring and to subdivide the organics into two groups was made prior to the study
proper. The value of following these procedures decided upon at that time is made 
obvious by the subsequent results. 

Some clinical observations regarding the patients’ behaviors are worthy of men- 
tion. It was surprising to the examiners that many patients who bore organic diag- 
noses but who seemed to possess clear sensoriums and gave no gross, outward mani- 
festations of brain damage obtained low scores. On the other hand, several patients 
who were patently confused, disoriented, and showed other signs of organic path- 
ology obtained high scores. It is also to be noted that the two schizophrenics who 
obtained organic scores were the only two in the study who were suspected, prior to 
the test, of having brain damage. This suspicion has not as yet been confirmed, 
however. It is also the author’s belief that among the organics who obtained high 
scores the duration of the negative aftereffect was considerably less than among the 
nonorganics, but this observation was not subjected to test. 


SUMMARY 


1. The spiral aftereffect test has been demonstrated to significantly differen- 
tiate between state hospital patients who bear a diagnosis of organic brain damage 
and those who are normal or diagnosed as schizophrenic. 


2. The test is apparently more sensitive to some subgroups of organics than to
others. In this study, those with acute or chronic brain syndromes associated with
alcohol or convulsions (idiopathic) tended to score more like nonorganic persons
than did other organics.


3. When those with organicity associated with alcohol or convulsions were
excluded, 66% of the organics in this study were identified by the test, whereas only
3% of the nonorganics were misclassified.


4. Normals, schizophrenics, and lobotomized schizophrenics were indisting- 
uishable from each other in terms of their scores on this test. 


5. Scores on the spiral test were unrelated to age, sex, or length of hospitaliza- 
tion since last admission. 


6. Test-retest and inter-tester reliabilities were demonstrated to be adequate. 


THE RORSCHACH AS A PHYSIOLOGICAL STRESS*
HUDSON JOST AND LEON J. EPSTEIN 


University of Georgia          Veterans Administration Hospital,
                               Brooklyn, N. Y.


PROBLEM 


This study investigates the Rorschach examination as a stress producing situa- 
tion. Its unstructured character is sufficient to suggest that it may be stressful to the 
subjects studied. In order to quantify the degree of stressfulness, physiological meas- 
urements were obtained at the time of the free association portion of the examina- 
tion. A search of the literature indicated that only a few studies had used this ap- 
proach and these either used one measure of physiological response, namely the gal- 
vanic skin response (GSR), or did not use simultaneous measurements. 

Frost and Rodnick “? examined normal and schizophrenic subjects in an effort 
to discover the relationship between certain Rorschach determinants and concomi- 
tant GSR activity. Normal subjects showed significantly larger deflections with 
form-dominant responses than did the schizophrenic subjects. The authors conclud- 
ed that the schizophrenics seem to have an altered capacity to control and modify 
their reaction to reality. 

Rockwell, Welch, Kubis, and Fisichelli® studied the relationship between the 
GSR and “color shock” on the Rorschach test. Continuous graphic recordings were 
made of variations in palmar skin resistance during the administration of the Ror- 
schach test to both normal and clinically psychoneurotic subjects. The clinical 
group presented fewer responses as well as a lower mean level of GSR. 

Levy “ examined fifty male college students individually and their maximum 
galvanic deflection per card was measured. There was no significant increase in skin 
conductance accompanying the colored plates and Levy, therefore, concluded that 
Rorschach cards do not differ among themselves in affective value. Oddly enough, 
however, Levy found for her sample a marked change in skin conductance associated 
with the position of the card in the series. When the Rorschach cards were systemat- 
ically presented in a rotated order, the eighth in the series, no matter which card it 
happened to be, was accompanied by galvanic changes that were significant beyond 
the one per cent confidence level. 

Two other studies compared Rorschach responses with physiological changes 
measured in other settings. Brower? correlated certain Rorschach categories from 
the protocols of subjects selected at random from a college undergraduate popula- 
tion, with measurements of diastolic blood pressure, pulse pressure, and heart rate 
which were taken before and after a state of visuo-motor conflict. Such a conflict 
was induced by means of a mirror-drawing test, under the conditions of direct vision, 
mirror vision, and the blindfolded state. He found that higher pulse rate prior to the 
experiment and greater pulse pressure after the experiment tend to accompany per- 
sonality traits of constrictiveness and repressiveness. He concluded that cardiac 
excitation has some concomitant relation with those personality trends indicated 
by “F” in the Rorschach test. Furthermore, it appeared that subjects showing lower 
levels of adjustment to reality, as measured by the ‘FC’ Rorschach category, in 
Brower’s work, exhibited higher diastolic blood pressure prior to the experiment. 

Lacey et al.) subjected their subjects to four different stresses and related 
these findings to Rorschach examinations on twenty-six of the eighty-five subjects. 
In this study ‘‘an attempt was made to validate the Rorschach Form-Color Index 
of emotionality against the criterion of autonomic response to experimentally in- 
duced stress.’’ No clear relationship was found here, but when the response specific- 
ity of the individuals was evaluated against the Rorschach a significant relationship 
was found. 


*Acknowledgements to Dr. T. S. Hill, Department of Psychiatry, University of Tennessee College 
of Medicine, Memphis, Tennessee, for his interest and helpful suggestions. 


METHOD


In this study a Keeler Polygraph, Model 302, was used to obtain the physiologi- 
cal measurements. The subjects were thirty-two students from the Division of Medi- 
cal Sciences of the University of Tennessee, selected by choosing every second person 
from an alphabetical list. Fifteen of the subjects were student nurses and seventeen 
were medical students, the total group ranging in age from twenty to thirty-five 
years. The Rorschach test was administered individually, by qualified examiners, 
in accordance with the technique suggested by Beck“. The testing situation was 
entirely new to every subject. The examinations were conducted without interrup- 
tion in a quiet and private office with none present other than the subject and two 
examiners, one to operate the polygraph out of the subject’s view and one to administer
the Rorschach test. All examinations were given in the morning. The light
source in every case was daylight plus fluorescent lighting. 

The blood pressure cuff was placed on the subject’s right leg, just below the 
knee, and the galvanic electrodes were placed on the left foot. These devices were 
placed on the legs rather than on the arms, as is the customary procedure, in order 
to have the subject’s hands entirely free for normal handling of the Rorschach cards. 
At regular intervals, after Rorschach cards III and VI, the polygraph was stopped 
and pressure reduced in the cuff in order that good circulation might be restored in 
the subject’s leg.

At the beginning of each examination, after adjusting the polygraph, a control 
run of two minutes duration was made to determine the initial heart rate (HR), 
respiration (R), relative blood pressure (BP), and change of GSR of the subject in 
the resting state. During the administration of the test, the experimenter operating 
the polygraph pressed the stimulus marker whenever the subject made a response, 
and made a written note on the moving record in order that the polygraph recording 
could be matched with the Rorschach protocol. A final control run was taken at the 
end of the Rorschach free association period. No recordings were made on the poly- 
graph during the inquiry. 


RESULTS 


The results of the study are summarized in Figures 1 through 8. Figures 1 and 
2 present the changes in R pattern found during the control period (C) and the 10 
Rorschach cards. It should be noted in Fig. 1 that the average R frequency before 
the examination began, for this group of subjects, was 14.5 respiratory movements 
per minute. This rate increased to 19 per minute on Card I and then showed a grad- 
ual decrease in frequency throughout the series of cards and at Card X the frequency 
had returned to 16.3 respiratory movements per minute. Also presented in Fig. 1 
are the R frequencies both before and after the presentation of the card. Here it is 
obvious that the card, while the subject was looking at it, had a facilitatory effect on
the R rate and there was a marked recovery after the card was placed on the table. 
In all cards there was an increase in R frequency with recovery when the card was 
finally returned to the examiner. Another general patterning of R is shown clearly 
in Fig. 2. This is a measurement of R in terms of the amplitude of the R movement. 
It should be noted that there was a slight increase in the amplitude with card II and 
then a gradual decrease in the amplitude of the R movement throughout the ten 
cards. No one card seemed to have a marked effect on either the R frequency or 
amplitude. 

The results of the heart rate measurement are represented in Fig. 3; it should be 
noted that the Rorschach cards had a marked effect on the frequency of the HR and 
in all cases the frequency was greater during the time the subject held the card than 
either before or after the presentation. The HR frequency in the first period (C) for 
this group of subjects was 88 per minute. This increased to almost 99 per minute on 
Card I and, as with R, there was a continual recovery of HR during the examination 
and with Card X the frequency of HR had returned to a level below that of the 
control period. Both the measurements of HR and R indicate an adjustment pro- 
cess occurring in these subjects during the course of the examination. 

The third major measurement obtained during this study was that of the GSR. 
The results of this are shown in Figures 4, 5, and 6. In Fig. 4 the measurements of 
the initial and final resistance for each card are graphically portrayed. It should be 
noted that at the beginning of the study the average resistance of the group was 
68,000 ohms, which decreased to 65,000 ohms during the control period. This may 
be interpreted as increased anxiety during this basic control period before the pre- 
sentation of the cards. The general pattern of the subjects during the examination 
is again seen in this measurement. The resistance of the subjects dropped with Card 
I, further with Card II, followed by a general recovery pattern through Card X at 
which time the resistance level was only slightly below that of the earlier control
period. Fig. 5 presents the data on the initial GSR change when the card was first 
presented to the subject. This is a delicate measurement and we observe on Cards I, 
IV, and VII an increased response. This is undoubtedly due to an artifact in the 
experimental design; it was necessary to discontinue the study temporarily after 
Cards III and VI in order to keep the discomfort of the blood pressure cuff at a minimum.
It is the feeling of these workers that the increased responses on IV and VII are
the result of this experimental artifact. It should also be noted that in the early 
cards the response was greatest, and as the examination progressed the subject ad- 
justed to the situation and the size of this initial galvanic response diminished. This 
same pattern of adjustment is seen in Fig. 6, which is a measurement of the recovery 
of the subjects after the initial shock of the cards. The smaller numbers indicate less 
recovery than the larger ones. On Card I there is less than a 10 per cent recovery in 
the GSR response within the first 15 seconds. There is a gradual progression until 
Card X, where there is a 60 per cent recovery. Here again we find on Cards IV and 
VII the artifact which was mentioned earlier. 
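
As a rough arithmetic check on the group means quoted in the text above, the changes from the control period can be expressed as percentages. The sketch below uses only figures stated in the Results (respiration 14.5, 19 and 16.3 per minute; heart rate 88 and "almost 99" per minute; resistance 68,000 and 65,000 ohms); the function and variable names are illustrative and are not part of the original study.

```python
# Percent change from the control period, using group means quoted in the Results.
# All names here are illustrative; only the numeric values come from the text.

def percent_change(baseline, value):
    """Return the change from baseline to value as a percentage of baseline."""
    return 100.0 * (value - baseline) / baseline

# Respiration: 14.5 per minute at control, 19 on Card I, 16.3 on Card X.
resp_card_i = round(percent_change(14.5, 19.0), 1)
resp_card_x = round(percent_change(14.5, 16.3), 1)

# Heart rate: 88 per minute at control, "almost 99" on Card I (99 used as an upper bound).
hr_card_i = round(percent_change(88.0, 99.0), 1)

# Skin resistance: 68,000 ohms at the start, 65,000 ohms during the control period.
gsr_control = round(percent_change(68_000, 65_000), 1)

print(resp_card_i, resp_card_x, hr_card_i, gsr_control)
```

On these figures respiration rises by roughly a third on Card I and is still somewhat above the control level at Card X, which is consistent with the gradual adjustment the authors describe.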

Fig. 7 presents the total BP change during the time the subjects held the card. 
Here again, unfortunately, the artifacts of the experimental design are shown to be 
masking the total pattern of BP responses. The material is presented here merely 
to show that the blood pressure response is a delicate one and when the study is re- 
peated the experimental artifact of this study should be removed. We believe that 
the GSR and the BP reactions are the most delicate of those studied. Fig. 8 shows 
the BP changes in another way. Here the BP change during the first 15 seconds is 
shown. This indicates a little better the gradual adjustment made by the subjects 
during the study. 


DISCUSSION 


Insofar as the physiological measurements used in this study indicate the degree 
of stress imposed by the Rorschach examination, the test is a stressful one. It also 
shows that control (nonclinical) subjects are able to adapt to the stress to a marked 
degree during the testing period. (Some unpublished material indicates that this is 
not true of some clinical cases.) 

The measurements of R and HR were found to be the most stable in this group 
studied. Both of these show a relatively straight line recovery from Card I through 
Card X. They were relatively unaffected by the experimental artifact of the study 
(stopping the polygraph after Cards III and VI). The facilitatory effect of looking at
and responding to the cards is readily seen in the different levels of physiological 
activity when the before and after measurements are compared to the ones obtained 
when the cards were held. 

Unfortunately the measurements of GSR and BP do not give as clear a picture
of the adjustment during the test situation as do the ones mentioned above; however,
the general trend toward less responsiveness (adjustment) is seen. Cards IV and
VII appear to be more exciting than the others when these measurements are used as 
criteria. This is not the case; the measurements of GSR and BP are the most delicate 
used in this study, and they show the greater response of the subjects after the rest 
periods. This study tends to confirm the earlier studies using physiological measure- 
ments, in that none of the Rorschach cards were found to be more exciting than others 
to a control group. The adaptation of the group to the examination is clearly seen in 
the curves of physiological reactivity shown in Figures 1 through 8. 


SUMMARY 


Simultaneous measurements of respiration, heart rate, relative blood pressure 
change and galvanic skin resistance levels and responses were obtained during the 
free association portion of the Rorschach examination on 32 nonclinical subjects 
consisting of medical and nursing students. The findings indicate that the Rorschach 
examination is a stressful one as reflected in the physiological changes associated with 
its administration. The subjects showed a marked increase in reactivity, as reflected
in the measurements, with Cards I and II and then a gradual adjustment
through Card X. None of the cards seemed to have specific exciting value for this 
group. 


THE CORNELL MEDICAL INDEX IN A PSYCHIATRIC OUTPATIENT
CLINIC


FRANKLYN N. ARNHOFF, PH.D., LA VERN C. STROUGH, M. D. 
AND RICHARD B. SEYMOUR, PH.D. 


Nebraska Psychiatric Institute and the University of Nebraska College of Medicine 


PROBLEM 


The Cornell Medical Index (CMI) is a 195 item questionnaire consisting of 144 
general medical and 51 psychiatric questions to which the patient indicates his agreement
or disagreement by circling the appropriate “yes” or “no” after each question. 7).
The questions are grouped according to types of systemic complaints for the
medical items and by moods, attitudes, and behavior for the psychiatric items, form- 
ing a total of 18 scales (See figure 1). The authors of the questionnaire have reported 
considerable research with the CMI“: ? % 4 5 & 7, 19 and feel that it has decided 
value in both medical and psychiatric settings. It was the present authors’ intent to 
investigate this instrument with a fairly well defined psychiatric population and to 
subject the results to rigorous statistical analyses for evaluation. Specifically, we 
wished to investigate possible differences between male and female patients in their 
types of complaints and to ascertain the validity and usefulness of past findings for 
our sample. 


PROCEDURE 


For a period of one year at the Nebraska Psychiatric Institute Outpatient Ser- 
vice and the University of Nebraska College of Medicine Psychiatric Dispensary, 
all patients were initially required to fill out the CMI. From this total sample we 
discarded all cases of mental deficiency and diagnosed physical illnesses, which netted 
a final group of 101 adult white subjects: 45 males and 56 females. The male and 
female patient groups were found to be approximately equated for age (males: mean 
37.5, range 16-73; females: mean 34.3, range 17-70) and education (males: mean 11.1, 
range 6-16 years; females: mean 10.3, range 6-16) with no obvious group differences
in socio-economic background or status. The psychiatric diagnostic distribution for 
the two groups is essentially the same, as shown in Table 1. For comparative pur- 
poses, age and educational and psychiatric information are not available from pre- 
vious studies so the comparability of samples remains unknown. 


TABLE 1. C. M. I. DIAGNOSTIC CATEGORY TABULATION

Classification                                      Male    Female

Psychotic Reactions:
    1. Organic
    2. Psychogenic
Psychophysiological Autonomic and Visceral
Psychoneurotic Disorders
Personality Pattern Disturbances
Personality Trait Disturbances
Sociopathic Personality Disturbance
Transient Situational Personality Disorder
Unclassified

Total








RESULTS

While the CMI can be, and has been, used clinically by merely scanning the subject’s
“yes” responses, previous studies have indicated statistical criteria“: 4 &
upon which diagnostic evaluations can be made and by which patients suffering from
emotional ills can be recognized. The previous investigators report that a cutting
score of 30 or more “yes” responses on the CMI enabled them to identify correctly
up to 76% of the psychiatric patients seen“. While they report comparisons with
“normals” and nonpsychiatric samples, they arbitrarily divided their emotionally ill
group into two classifications, “neurotic” and V. A. psychiatric, which makes comparisons
difficult. Furthermore, no statistical evaluations of the differences between
their groups are reported. Using the criterion score of 30 or more “yes” responses, we
were able to identify correctly 69% of the males and 79% of the females in this
known psychiatric patient sample. Using a cutting score of 50 or more “yes” responses,
as was done in another reported study on military neuropsychiatric screening‘),
we were able to identify only 36% of the males and 52% of the females. As
ours was a known, diagnosed psychiatric group, this seriously questions
the advisability and reliability of screening procedures with such a cutting point. The
ultimate test of cutting scores per se, however, awaits further research in which well
defined populations of normals, as well as medical and psychiatric patients, are compared
by appropriate sampling and statistical methods. The cumulative percentage
distribution of the number of “yes” responses by patients in the present study is
shown in Table 2.


TABLE 2. CUMULATIVE PERCENTAGE OF PATIENTS’ NUMBER OF “YES” RESPONSES

Number of Responses     Males (N = 45)     Females (N = 56)

10 or more                   96                  95
20                           82                  89
*30                          69                  79
40                           58                  59
*50                          36                  52
60                           27                  45
70                           13                  30
80                            7                  23
90                            1                  20

*Suggested critical scores used by previous authors.
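
To make the two starred cutting scores concrete, the percentages in Table 2 can be converted back into approximate patient counts and a combined identification rate. This is only a sketch: the counts are recovered by rounding the published percentages against the stated group sizes (45 males, 56 females), and the function names are illustrative rather than anything used by the authors.

```python
# Approximate counts of patients at or above each cutting score,
# recovered from the Table 2 percentages (rounded, so only approximate).
N_MALES, N_FEMALES = 45, 56

# cutting score -> (male cumulative %, female cumulative %), from Table 2
CUMULATIVE_PERCENT = {30: (69, 79), 50: (36, 52)}

def approx_count(percent, n):
    """Convert a reported percentage of a group of size n into a whole-number count."""
    return round(percent * n / 100)

def summarize(cutoff):
    """Return (males identified, females identified, combined % identified)."""
    pct_m, pct_f = CUMULATIVE_PERCENT[cutoff]
    males = approx_count(pct_m, N_MALES)
    females = approx_count(pct_f, N_FEMALES)
    combined = 100 * (males + females) / (N_MALES + N_FEMALES)
    return males, females, round(combined, 1)

for cutoff in (30, 50):
    print(cutoff, summarize(cutoff))
```

Under these assumptions a cutting score of 30 identifies about three quarters of the combined sample, while a cutting score of 50 identifies fewer than half, which is the contrast the authors draw in the text.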

In conjunction with the absolute number of “yes” responses by the patient,
additional criteria of emotional disturbance manifest on the CMI have been ad- 
vanced. In the CMI manual? the authors state that the patient with emotional in- 
volvement tends to scatter his ‘yes’ responses over the entire record which, there- 
fore, represents diffuse complaints, not specific to any one or two organ systems. This 
was substantiated in our sample with the mean number of scales responded to with 
one or more “yes” responses as follows: males: mean number of scales 13.3, standard
deviation 3.92; females: mean number of scales 14.2, standard deviation 4.04. The
differences between the respective means and standard deviations were not signifi- 
cant. 

Further “presumptive evidence on the CMI of the patient having an emotional 
disturbance,” “: P- %) was reported as: 

1. Three or more questions answered both “yes” and “no.” 
2. Omitted answers from six or more questions. 
3. Three or more remarks or changes in the questions. 


In the present study none of the above three criteria appeared with enough frequency
to warrant their acceptance as being useful. Out of the sample of 101 patients,
4 answered 3 or more questions both “yes” and “no,” 11 omitted 6 or more questions
(9 were women), and only 15 (about equally divided between sexes) made 3 or more
remarks or changes in the questions.


Analysis of Psychiatric Scales. The psychiatric scales were analyzed to determine 
possible differences in complaints between the sexes, and between the individual 


FIG. 1. SEX DIFFERENCES ON MEDICAL AND PSYCHIATRIC C. M. I. SCALES

[Figure: responses for each sex on C. M. I. scales A through R, medical scales A-L and
psychiatric scales M-R; scale H (GU) is also shown as a corrected scale.]

A....Eyes and Ears          G....Nervous System          M....Inadequacy
B....Respiratory            H....Genitourinary           N....Depression
C....Cardiovascular         I....Fatigability            O....Anxiety
D....Digestive              J....Frequency of illness    P....Sensitivity
E....Musculoskeletal        K....Misc. Diseases          Q....Anger
F....Skin                   L....Habits                  R....Tension

scales themselves for both sexes. The scale by scale comparisons were complicated
by the difference in the number of items in each scale, so that it was necessary to
convert the data to mean number of responses per person per item, which effectively
equated the scales for length and permitted appropriate statistical tests.¹ The
graphic representation of the distribution of responses to each of the scales for both
sexes (Figure 1) is reported in these terms. The relative values, however, remain the
same and, hence, can be meaningfully evaluated on the graph.
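
The length correction described above, together with the angular transformation mentioned in the authors' footnote on the analysis of variance, can be sketched as follows. The scale lengths and response totals in the example are hypothetical, chosen only to show that two scales of different length become comparable; the arcsine-square-root form is the usual meaning of "angular transformation" and is assumed here, since the paper does not spell out the formula.

```python
import math

def per_item_rate(total_yes, n_persons, n_items):
    """Mean number of 'yes' responses per person per item for one scale."""
    return total_yes / (n_persons * n_items)

def angular(p):
    """Angular (arcsine square-root) transformation of a proportion, in radians.

    Assumed form; commonly used to stabilize the variance of proportions.
    """
    return math.asin(math.sqrt(p))

# Hypothetical example: a 9-item scale and an 18-item scale, 45 respondents each.
short_scale = per_item_rate(total_yes=81, n_persons=45, n_items=9)
long_scale = per_item_rate(total_yes=162, n_persons=45, n_items=18)

print(short_scale, long_scale)   # the raw totals differ, the per-item rates do not
print(angular(short_scale))      # value that would enter the analysis of variance
```

Dividing by the number of items removes the advantage a longer scale would otherwise have in raw counts, which is why the relative positions of the scales in Figure 1 can be read directly.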

No significant differences were demonstrated between the sexes in the number 
of complaints on any of the psychiatric scales. Although the tendency here was for 
males to give more complaints than females on some scales, the differences were not 
statistically significant. Comparison of the individual scales with each other²
revealed no scale differences for females but some significant differences for males, with
Sensitivity the most frequently responded to scale. For the sexes combined Sensi- 
tivity and Anxiety were found to be significantly higher in number of complaints 
than the two lowest scales of Inadequacy and Tension. The other two psychiatric 
scales, Depression and Anger, were between these two extremes and were not signifi- 
cantly different from the others. 


Medical Scales. As has been reported previously,“ 5 7 there is a tendency for 
females to report more complaints than males, although for the most part the differ- 
ences were not statistically significant in the present study. Our analysis of the med- 
ical scales followed the same procedures as previously mentioned, with all scales 
equated for number of items and over-all analyses made by analysis of variance with 
results reported at p: .05 or better.³

Sex differences were demonstrated significantly on only 4 out of the total 18
scales.⁴ These scales were: Cardiovascular (p < .01), Genito-urinary (p < .01),
Eyes and Ears (p < .05) and Miscellaneous Diseases (p < .05). The variability of
the ‘‘yes’”’ responses was also found to differ significantly for these same scales. In all 
instances, significant differences are in the direction of more complaints for the 
females. When the Genito-urinary scale was corrected for comparative purposes by
removing the items which were different for the sexes, as they referred to specific
individual sex problems such as menstruation, etc., no significant difference in the
number of complaints was found.
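
The authors' footnote on the four significant scales states that obtaining four differences at the .05 level out of 18 comparisons exceeds chance expectancy at about the .01 level, a judgment they read from a published graph. A minimal sketch of the same idea is the binomial tail probability below. It assumes the 18 tests are independent, which the scales of one questionnaire need not be, so it is an approximation and not the authors' own procedure.

```python
from math import comb

def prob_at_least(k, n, alpha):
    """Probability of k or more 'significant' results among n independent tests,
    each with false-positive rate alpha, when no real differences exist."""
    return sum(comb(n, i) * alpha**i * (1 - alpha) ** (n - i) for i in range(k, n + 1))

p_four_of_eighteen = prob_at_least(4, 18, 0.05)
print(round(p_four_of_eighteen, 4))
```

Under the independence assumption the probability comes out close to .01, in line with the figure the authors cite.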

As significant scale differences were demonstrated, these scales were subjected
to an item analysis to determine which items were mainly responsible.⁵
Females were found to have more complaints of difficulty in breathing, swollen
ankles, cold hands or feet (all cardiovascular), kidney or bladder disease (genito-urinary),
and anemia, overweight, varicose veins and tumor or cancer (all from miscellaneous
diseases). As no item differences were demonstrable on the Eyes and Ears
scale, the demonstrated scale difference probably represents a compounding of 
chance variations. Although the genito-urinary scale was not significantly different 
after correction, one of the items was, as mentioned above. The authors wish to 
make it clear that these item analyses were carried out only for scales demonstrating 


¹The over-all score comparisons were made with analysis of variance.) Because of a demonstrated
correlation between means and variances, and nonnormality of the distributions, it was necessary
to transform the data in order to use validly this statistical method. The angular transformation
was used: 4) which effectively reduced the correlation to 0 and had the further desired effect of more
nearly normalizing the distributions. Results are reported at p: .01 or better.

²Scale by scale comparisons for significance of the differences were made with Duncan’s multiple
range test ‘*) with significance reported at p: .05 or better.

³A comparison was made for significance of the difference between means on both medical and
psychiatric scales using the 2 sample randomization approximation to the t-test?) which found significant
differences (p: .01) on medical, but not on psychiatric scales. Evaluation of these findings was
made by the graph of the number of significant differences expected in a given number of tests made,
constructed by Sakoda, Cohen and Beale. “*)

⁴To obtain four significant differences at the .05 level out of the 18 comparisons is greater than
chance (p: .01) expectancy. “*) While we can therefore feel sure that some of these differences are not
due to chance variation alone, it is impossible to determine which ones are.

⁵Item analyses performed by Chi-square, using Yates correction for continuity where needed, “!.
significant sex differences and not for all the items on the C. M. I. Hence, it is quite
conceivable that other items, not tested, might also differ significantly. Of all
the medical scales, Fatigability was responded “yes” to significantly more often than
any other scale, by both sexes, and the least number of complaints was reported for
Musculoskeletal (see footnote 2).
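
The item analyses referred to above were carried out, according to the authors' footnote, by chi-square with Yates' correction for continuity. A minimal sketch of that statistic for a 2 x 2 table (sex by "yes"/"no" on a single item) is given below. The cell counts in the example are hypothetical, since the paper does not report item-level frequencies, and the function name is illustrative.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square with Yates' continuity correction for the 2 x 2 table

            yes   no
    group1   a     b
    group2   c     d
    """
    n = a + b + c + d
    # The correction subtracts n/2 from |ad - bc|, never letting it go below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    return n * corrected**2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical item: 10 of 45 males and 25 of 56 females answer "yes".
chi2 = yates_chi_square(10, 35, 25, 31)
print(round(chi2, 3))  # compare with 3.84, the .05 critical value for 1 d.f.
```

With small cell frequencies the correction lowers the statistic somewhat, making the test more conservative, which is presumably why the authors applied it "where needed."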

DISCUSSION 

Although the present writers have subjected these data to considerable statistical
analysis, it is their strong feeling that the usefulness of this instrument in
clinical application remains on a nonstatistical, individual level, in which the
patient’s responses are simply noted and serve as the basis for further evaluation
and/or examination by the practitioner. It is in this manner that we see the C. M. I.
as making a worthwhile contribution to the clinician, enabling him to obtain necessary
information in an objective, standardized, and economical manner. While
group findings, as reported, have definite value, it has been our frequent experience
that patients who admit to a plethora of complaints in a personal interview, and
who are presenting themselves for treatment at the clinic, deny all on the C. M. I.;
they would, therefore, be missed on the basis of statistical or screening use of
the Index.

From the data presented here, it is interesting to note that we find our female 
patients complaining significantly more often of tumor or cancer and kidney or blad- 
der disease, despite the absence of physical findings. It is quite possible that such ail- 
ments carry particular emotional connotations to the female and may more often 
serve as focal points for her fears and anxieties. The very frequent complaint of
fatigability in both sexes is worthy of note and deserves further clinical examination
and investigation. While our findings present some pertinent facts, we realize
that our sample is small and limited in scope, and that our findings may not be
applicable to other population samples. Despite these limitations, however, these
results do seriously question past statements and statistical criteria
for use of the C. M. I. and would, therefore, indicate far more caution in its
use, pending further adequate investigation.


SUMMARY 


The Cornell Medical Index, a self-administered 195 item questionnaire of medi- 
cal and psychiatric items was given to 45 male and 56 female psychiatric outpatients, 
with the two sex groups equated for age, education and type of illness, as well as ap- 
proximate socio-economic background. Analyses revealed few sex differences on the 
medical scales and none on the psychiatric scales. Most significant of the findings 
was the very frequent complaint of fatigability for both sexes, as well as significantly
more complaints by the female patients of tumor or cancer and kidney or bladder 
disease, despite the absence of physical findings. Various criteria for use of the 
C. M. I. reported by other investigators were statistically examined and were not 
found valid for our sample. It is the present writers’ conviction that while the
C. M. I. can offer valuable information in a standardized, economical, and objective 
manner to aid the clinician, its use for screening purposes should be curtailed pend- 
ing further extensive and rigorous investigation. 


A QUANTITATIVE METHOD OF SCALING COMMUNICATION
INTERACTION PROCESS¹
HAROLD J. FINE AND CARL N. ZIMET 


Veterans Administration Veterans Administration Hospital 
Bridgeport, Connecticut Albany, New York 


PROBLEM 
In the evaluation of psychotherapeutic and social action progress, it is necessary
to apply measuring instruments that will accurately reflect behavioral change resulting
from such a program. Recently, in a research project in which modification
in the communication procedure during group discussion was studied, it was neces- 
sary to construct an instrument to measure the degree and quality of participation in 
the group discussion“. It was further required of the instrument that it reflect all 
those modifications of the communication process which indicated increasing 
warmth of interpersonal reaction and greater self understanding. One of the most 
fruitful systems of quantifying communication was devised by Bales“? in his in- 
vestigation of interaction process. He constructed an observation system of 12 
categories of behavior which could be scored by trained judges. This scoring method 
took into account both verbal and non-verbal participation. Bales’ categories and 
scoring give a characterization of the kinds of social situations being created by a 
group and its members and how they react to the atmosphere they have established. 
The drawbacks of this system, however, lie in the fact that a machine, the Interaction
Recorder, is important for setting up an accurate scoring system, and that
furthermore, extensive training is required to become proficient in the use of the 
Bales’ system. It was also felt that parts of Bales’ interaction scale were inadequate 
to meet the demands of the theoretical framework of this research, namely, that of 
group-centered psychotherapy. Another method of analyzing interaction was set up 
by Gorlow®). This, however, dealt exclusively with theme analysis by which trans- 
criptions of group psychotherapy sessions could be scored. 


PROCEDURE 
In the present study a participation rating scale was drawn up as an adaptation 
and reformulation of the categories of Bales’ interaction scale and Gorlow’s system of 
theme analysis. This participation rating scale (hereafter called PRS) was divided


¹This paper is based in part on the doctoral dissertation of the senior author. The writers wish
to express their grateful acknowledgment to Dr. Arthur W. Combs for his help and guidance. 







into 15 categories which ranged from a most permissive, warm and understanding 
category to a category that depicted the most aggressive, unfriendly, tense and 
hostile type of participation in a group. Student-centered classes at Syracuse Uni- 
versity were used to test the efficacy of the prototype scale. Three separate classes 
were used for this pilot study and the degree of congruence among judges scoring the 
verbal participation was checked. The pilot study revealed a mean agreement of 75 
per cent in scoring among 3 independent observers. As a further measure of agree- 
ment, chi square in a contingency table was 17.82 for 10 d.f., which is between the .05 
and .10 confidence levels. In the light of the pilot findings the PRS was reduced in 
length to the final 12 categories, and the use of this instrument raised the congruence 
of agreement among the three observers to a mean of 87 per cent and a chi square of 
18.42 for 9 d.f. This indicated that the agreement among scorers was beyond the .05 
level of confidence. 
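
The significance statements above can be checked numerically. The following is a minimal sketch, using only the Python standard library, of the upper-tail probability of a chi square statistic for a whole number of degrees of freedom. The function name `chi2_sf` is an illustrative assumption and is not part of the original study.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for chi square with integer df."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # Q(1, y) = exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # Q(1/2, y) = erfc(sqrt(y))
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

pilot = chi2_sf(17.82, 10)   # pilot scale, 15 categories
final = chi2_sf(18.42, 9)    # final scale, 12 categories
print(round(pilot, 3), round(final, 3))
```

Under these assumptions the pilot value should fall between .05 and .10 and the final value below .05, in line with the levels reported in the text.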

The twelve categories were not formulated as discrete entities. Instead the para- 
digm is a straight line where the underlying distribution is a continuous series. At 
the same time the boundaries of one category were conceptualized to be quite flex- 
ible so that the properties could overlap in a dynamic manner into its proximate 
categories. A cluster or grouping of categories was drawn up so that four categories 
of three items each could be elicited. Items 1, 2, and 3 at the upper end of the scale 
were clustered in area 1 which reflected the most positive type of participation by 
therapy participants. Items 4, 5 and 6 were grouped in area 2 which reflected a 
neutral type of participation that had a tendency toward a positive type of inter- 
action. Items 7, 8 and 9 were clustered in area 3 which reflected a neutral or ambi- 
valent type of participation that tended toward the negative end of the scale. Items 
10, 11 and 12 were clustered in area 4 which reflected an essentially negative and hostile
type of participation. The categories are: 


Category I.    1. Expresses warmth, understanding, sympathy; sees others’
                  viewpoints. States own position with dignity and gives logical
                  reasons for it.

               2. Expresses simple acceptance of others’ views: “I see, I
                  understand.”

               3. Asks open-ended questions or states tentative hypothesis:
                  “What do you think of this? How does this strike you? I’ve
                  kind of thought ... I wonder about ...”

Category II.   4. Asks clarifying questions: “I don’t see, I don’t quite
                  understand—I’m not sure I get what you mean.”

               5. “I see what you mean. I know how you feel. Things are
                  sometimes like that.”

               6. Expresses approval or encouragement: “That is right.”
                  States problem, seeks clarification.

Category III.  7. Expresses reassurance, unsupported opinion, states problem,
                  gives information.

               8. Expresses counter-opinion: “My opinion is ...” “I don’t
                  agree, I don’t think that’s a good idea.”

               9. Persuades, suggests, advises: “Why don’t you; if I were
                  you ...” Confronts with own inconsistency or weight of
                  authority.

Category IV.  10. Interprets, evaluates: “I doubt that, you’re wrong.” Shows
                  self nervous, unstable.

              11. Deflects, changes subject, does not respond to remarks
                  addressed to him.

              12. Disapproves, deflates others’ status, hostile, defends self.
                  Calls upon other members or group leader for support.
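
The grouping of the twelve items into four areas of three items each can be written as a simple rule. The sketch below is illustrative only; the function name `prs_area` is an assumption introduced here and does not appear in the original scale.

```python
def prs_area(item):
    """Map a PRS item (1-12) to its cluster area (1-4), three items per area."""
    if not 1 <= item <= 12:
        raise ValueError("PRS items run from 1 to 12")
    return (item - 1) // 3 + 1

# Items 1-3 fall in area 1, items 4-6 in area 2, and so on.
print([prs_area(i) for i in range(1, 13)])
```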

The PRS was recorded in the following manner: during each meeting of a group
therapy-like seminar, each response of a verbal nature by a group member was scored 
by observers trained in the use of the scale. The observers did not participate in the 
meetings themselves. A score sheet was set up to record the responses. Each member 
of the group was given a code number memorized by the observers. Each partici- 
pant’s response, whether of a few words or a lengthy verbalization was considered a 
unit and evaluated in terms of the quality of its theme and this evaluation of the 
interaction was rated on the scale. In evaluating a response the observer must rely 
not only on the response itself, but the inflection and modulation of speech, the 
words that make up a response and the expressive facial and bodily features that 
accompany such a response. Gestures and movements without verbalization do not 
constitute a scorable response. For example, if subject number 8’s opening remark 
was of a hostile, defensive nature, a number 8 would be scored in category 12 under 
A. The code number of the member speaking next would be placed in the appro- 
priate category under B, etc. 
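
The recording procedure may be pictured as a running list of (member code, category) pairs kept in speaking order. The sketch below is one possible representation and not the authors' actual score sheet; the sample responses and the names `score_sheet` and `member_profile` are hypothetical.

```python
from collections import Counter

# Each tuple is (member code number, PRS category), in speaking order.
# Hypothetical data: member 8 opens with a hostile, defensive remark.
responses = [(8, 12), (3, 4), (8, 9), (5, 1)]

def score_sheet(responses):
    """Return {category: [member codes in order of speaking]}."""
    sheet = {category: [] for category in range(1, 13)}
    for member, category in responses:
        if category not in sheet:
            raise ValueError("category must be between 1 and 12")
        sheet[category].append(member)
    return sheet

def member_profile(responses, member):
    """Count how often one member was scored in each category."""
    return Counter(category for m, category in responses if m == member)

sheet = score_sheet(responses)
print(sheet[12])                      # members scored in category 12
print(member_profile(responses, 8))   # category counts for member 8
```

The lettered columns of the original sheet preserve the order of speaking; in this sketch the order of the `responses` list plays that role.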


RESULTS 

To test whether the ordering of items from one to twelve was a justifiable one, 
each category was typed on a 3 x 5 card and shuffled at random. Five judges were 
asked to sort the items hierarchically from the most warm, understanding, and per- 
missive to the most hostile and negative category. Average agreement with the 
authors’ criteria for the five judges was 93.6 per cent. 

In using the PRS it was fundamental that observers’ ratings be reliable. Three 
observers, one of them the senior author, scored the protocols during the group meet- 
ings. There were two series of congruence checks for both the individual and the 
clustered categories. The first series consisted of 8 sessions so spaced that there would 
be at least one congruence measure by two observers during each of the quarters of 
the group meetings. The second series consisted of three additional ratings with an- 
other observer trained in the scoring technique (Table 1). Chi square analyses for 
each of the 11 congruity checks were between the .05 and .01 levels of confidence. 
Not only should the degree of agreement among scorers serve as an external criterion 
of the congruence, but the question may be raised about a repeated analysis by the 
same researcher after a passage of time. Accordingly two 60-minute informal group 
meetings were electronically recorded using a different population from that of the 
original study. The interaction in these two sessions was scored by one of the authors 
at the time of the group meetings. Three months later the electronic recordings were 
played and rescored. The mean agreement was 76.6 per cent for discrete items and 
87.35 per cent for clustered items. The lower congruence is likely to have been a 
function of the inability to recapitulate the expressive gestures and inflections. Chi 
square analysis for each was beyond the .05 level of confidence. 
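
The difference between agreement on discrete items and agreement on clustered items can be illustrated with a short sketch. The ratings below are hypothetical and the function names are assumptions; the study's own protocols are not reproduced here.

```python
def area(item):
    """Cluster area (1-4) for a PRS item (1-12)."""
    return (item - 1) // 3 + 1

def percent_agreement(first, second, key=lambda item: item):
    """Per cent of responses on which two sets of ratings agree."""
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be non-empty and of equal length")
    hits = sum(key(a) == key(b) for a, b in zip(first, second))
    return 100.0 * hits / len(first)

# Hypothetical ratings of the same eight responses by two observers.
observer_1 = [1, 4, 9, 12, 7, 2, 5, 10]
observer_2 = [1, 5, 9, 11, 7, 3, 5, 10]

discrete = percent_agreement(observer_1, observer_2)
clustered = percent_agreement(observer_1, observer_2, key=area)
print(discrete, clustered)
```

Because adjacent items usually fall in the same area, clustered agreement can only equal or exceed discrete agreement, which is the pattern reported above.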


TABLE 1. PER CENT OF AGREEMENT BETWEEN OBSERVERS ON PARTICIPATION RATING SCALE WITH
AUTHOR’S SCORES USED AS STANDARD


CONCLUSIONS 

The greatest problem of research in social action processes and group and in- 
dividual psychotherapy has been the quantification and statistical manipulation of 
data meaningful to the process of personality change under therapy. In this paper a 
scale has been described that was utilized in a research project in which a group of 
school administrators underwent a therapeutic experience. Through this method an 
attempt was made to measure the behavioral changes that occurred during a series 
of therapy-like meetings. Some degree of validation has been demonstrated by
correlating the behavior changes noted on the PRS with changes found on projective
test material ©. 

The method of measurement described is not only a scale that rates the quality 
of interaction, it also identifies the process of communication in a small group setting. 
It attempts to assess the open or closed channels of communication operating within 
a theoretical framework of psychotherapy. However, since it is primarily a scale of 
human interaction, it need not be limited to group psychotherapy but can be utilized 
in a variety of situations in which group discussion serves as a basis for decision 
making. 


EVALUATIVE STUDY OF ONE HUNDRED TRANSORBITAL
LEUCOTOMIES 


MATTHEW D. MERMELSTEIN 


Clarinda (Iowa) Mental Health Institute 


PROBLEM 


The application of psychosurgical techniques as a treatment for mental illness 
was first introduced by Moniz in 1935. Since then many different procedures have 
been initiated and attempted with varying degrees of success. This paper reports 
the results of a program carried out at the Clarinda (Iowa) Mental Health Institute, 
covering a period from November 1951 to July 1953 and comprising a total of 104
operations by Fiamberti’s transorbital leucotomy method, performed by a
neurosurgeon from a nearby university school of medicine and a staff physician from the
institute itself. The method consisted of first anesthetizing a patient with one or two
electroshocks, then inserting a Steinman pin beneath the upper eyelid and driving it 
through to the anterior part of the frontal lobe. Medial, lateral, and vertical in- 
cisions were made. Postoperative procedures consisted of routine care and con- 
tinuous observation for twenty-four to forty-eight hours, after which the ambulatory 
patients were gradually moved off the postsurgical ward to other wards commensur- 
ate with their overt behavior. An evaluation of these patients was made shortly 







after the operations, and a paper was published by the two physicians who per- 
formed the surgery, giving a partial evaluation of results®’. The present paper pre- 
sents the findings of a follow-up study of all the patients one and a half to three years 
after the operations. 

The main concern of the follow-up involved an evaluation of the total status of 
each patient after the operation, as compared to his condition before the operation. 
Because no control group was originally selected to serve for comparison purposes 
during the three year leucotomy program, it was not possible to utilize this method 
in the present study. Thus the control group necessarily became the operated group 
itself. Improvement is a very relative term and covers many areas. Various criteria 
had to be utilized, and it was not unusual to find patients responding positively in 
some areas of behavior and at the same time showing negative results in other 
respects. Degree of nursing care, reduction in symptomatic treatment (i.e.,
electroshock, sedation, restraints), type of domiciliary ward, release from the hospital,
relaxation of restrictions, etc., had to be weighed against each other. Primarily, most of
the patients were selected for leucotomy on the basis of chronic illness in which most 
of the other forms of conventional treatment had not succeeded in relieving symp- 
toms. It was hoped that even though complete recovery could not always be ac- 
complished, the operations would reduce anxiety and tension, resulting in a cor- 
responding reduction of violence or other undesirable behavioral features. 


PROCEDURES 


The hundred and four transorbital leucotomies were performed on 102 patients, 
two being repeated. Two patients died as a result of the operation. The evaluation 
is therefore based on a total of 100 patients. Since the exact onset of illness, prior to 
the time patients were hospitalized, is not accurately known for many of the patients
here reported, the continuous length of time they were in the hospital before the 
operations were performed is used for tabulation. Furthermore, some of the patients 
had had previous attacks with intervals of recovery. Such earlier periods of hospital- 
ization are not included in the present tabulation. 

For this study, then, a breakdown of each patient’s status for the time im- 
mediately prior to the operation was initially compiled from the ward and case 
records. This included age, continuous length of time in the hospital, diagnosis, 
treatment (E.C.T., sedation, insulin), restraints, type and number of seizures, I. Q. 
if recorded, ward residence (maximum security, convalescent, or custodial), type and 
extent of work placement in the hospital, overt behavior (combative, destructive, 
untidy, withdrawn “sitters”, hallucinated, delusional). Compared to this initial 
compilation another more recent evaluation was made in a follow-up one and a half 
to three years after the leucotomies were performed, using as much as possible the 
same ward and nursing personnel, case material, and interviews with each patient 
who still resided in the hospital. 

For final clarification of each patient’s progress or regression these two sets of 
compilations (the initial and the later one) were compared with progress notes made 
for each leucotomized patient soon after the operation (one to three months later) 
by the ward physician who had assisted in the operations. All evaluation data were 
transcribed on a separate index card for each patient. 

In the final evaluation of the patients’ status the following instructions were 
given to the nursing supervisors and ward personnel: “‘I am doing a follow-up study 
of the patients on your ward who have had leucotomies, and I want to know how 
they are now.” After obtaining the supervisors’ statements they were then told: 
“Here is how the patients were at the time of the operation.”” The information pre- 
pared on the index card was read to them and they were then further asked: ‘“How 
does that compare with their condition?” Resulting reports were recorded along with 
the information in each patient’s ward chart regarding current treatment, medica- 
tion, and behavior. If the ward personnel were not acquainted with the patients at 
the time the operation was done, then the supervisors who had known the patients 
continuously were consulted. Where ward personnel differed in their evaluations of
the same patient, supervisors were again asked for a final review. In better than 95 
per cent of the cases no discrepancies were found. 

A short interview was also conducted with each patient, but the great majority
were in such poor contact that they could not carry on any coherent conversation.
(Only one patient from the group was aware of the fact that he had been given a
leucotomy.)

The evaluations were placed into the following categories: “no improvement”,
“at least some slight improvement”, “moderate to good improvement.” No
improvement meant that all, or nearly all, of the outstanding behavioral traits attributed to
the patient at the time of the leucotomy were unchanged or had become worse, and 
that the amount of supervision and nursing care necessary at the time of the opera- 
tion remained the same or had increased. 

For example: Patient X described as follows: “At time of operation patient had
been continuously hospitalized one year, displayed violent outbursts toward others
necessitating periodic E. C. T., was actively hallucinated, had epileptic seizures
poorly controlled, received anti-convulsive medication, needed continuous restraints,
residing on the maximum security ward doing very occasional light ward work.
I.Q. recorded as 60 at time of admission. Two and one half years after the operation
he is on the maximum security ward, in restraint, dangerous toward others,
receiving E. C. T. as often as before. Mostly a ward sitter. Unable or unwilling to
respond to I.Q. test questions.” This patient rated “Not Improved.”

Patient Y described as follows: “At time of operation had been hospitalized
thirteen years, was on the maximum security ward in restraint, untidy, occasional
violent spells of kicking others, mute at times, in poor contact, receiving occasional
E. C. T., idle sitter. Two years after the operation she is now no longer in restraint,
is on the maximum security ward. Still untidy, idle sitter. No ground privileges.
Violent outbursts continue but less occasionally, necessitating less E. C. T.” This
patient rated “Slightly Improved.”

Patient Z described as follows: “At time of operation had been hospitalized nine
years. Repeatedly placed on maximum security ward, received one to two E. C. T.
every month, extremely quarrelsome displaying violent attacks on others. Loud and
profane speech. Preoccupied with ideas of persecution. Doing light ward work, no
ground privileges. One and a half years later patient is now assigned to a work
placement off the ward, been given ground privilege which was revoked but later
restored. Shows occasional combativeness necessitating two to three E. C. T. every
two months. Has become seclusive. Still argumentative and quarrelsome at times.
At least marginal adjustment on convalescent ward continuously for one year.” This
patient rated “Moderate to Good Improvement.”


RESULTS 

Table 1 is a description of the total sample based on diagnostic category and 
continuous length of time in the hospital. The incidence of degree of improvement 
within each category is represented for comparison purposes. It can be seen that the 
greatest number of cases (67%) were schizophrenics and that the largest number of 
these (29%) were long term chronic cases. However, degree of improvement did not 
seem to depend upon length of time of hospitalization, as shown by the fact that 
in the 1 to 2 year group 17% showed over-all positive results, as compared to the 3 to 
5 year group which showed a 30% gain, the 6 to 10 year group which showed a 
27% gain, and the 11 + year group which showed a 20% gain. The fact that 
certain patients seem to derive some benefit from a transorbital leucotomy can best 
be credited to some unknown variable. This writer prefers to attribute this fact 
to some fortuitous circumstance associated with the operation and hospital care 
subsequent to it. 

Table 2 is a comparison of the evaluations made shortly after the leucotomies 
(one to three months), and the evaluation now one and a half to three years after. 







TABLE 1. CLASSIFICATION OF 100 LEUCOTOMY PATIENTS BY CLINICAL DIAGNOSIS, PREOPERATIVE
LENGTH OF HOSPITALIZATION, AND EVALUATION OF RESULTS

                            Continuous Length of Time in the Hospital

                      1-2 Years       3-5 Years       6-10 Years      11+ Years       Totals
Diagnoses             N   *  ** ***   N   *  ** ***   N   *  ** ***   N   *  ** ***   N   *  ** ***

Manic Depress.
Schizophrenia                                                                         67  49  10
Epilepsy
Paresis
Mental Deficiency
Involutional Psychosis
Paranoid State
Psychopath

Totals                18  15   1   2  15  10   4   1  22  14   2   4  45  36   7

*Number of cases showing no improvement at the present time.
**Number of cases showing at least some slight improvement at the present time.
***Number of cases showing moderate to good improvement at the present time.
†Two patients deceased as a result of the operation.


TABLE 2. COMPARISON OF SHORT AND LONG TERM RATINGS OF PATIENTS’ IMPROVEMENT

                              Evaluation shortly      Evaluation now 1½
Rating                        after Leucotomies       years to 3 years after
                              (1-3 Months)

No Improvement                      34%                     75%

At least some slight
  Improvement                       44%                     14%

Moderate to good
  Improvement                       20%                      9%

Death                                2%                      2%

Total                              100%                    100%





From this table it is seen that a rather large number of relapses occurred. Forty- 
one percent of the cases returned to their previous levels of adjustment, with symp- 
tomatic electroshock, restraints, and sedation having to be reinstituted. Gradually, 
many of these patients drifted back into their previous behavioral patterns, being 
returned from convalescent wards to maximum security wards or closely super- 
vised custodial wards, losing ground privileges, being taken off work assignments, 
having to be dropped from group psychotherapy, or being returned to the hospital 
after brief trial visits. In some instances an increase in the amount of nursing care 
of these patients resulted due to more pronounced untidiness or development of 
epileptiform seizures. 
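
The relapse figure can be retraced from the percentages given in this paper. The sketch below assumes that the Table 2 entries of 44%, 20%, and 2% belong to the short-term evaluation, takes the long-term no-improvement figure of 75% from the Summary, and treats the whole net increase in unimproved cases as relapse. The variable names are illustrative only.

```python
# Short-term percentages (1-3 months after operation), assumed column of Table 2.
short_term = {"slight": 44, "moderate_to_good": 20, "death": 2}

# Long-term no-improvement percentage, as stated in the Summary.
long_term_no_improvement = 75

# Whatever is not accounted for at short term was rated "no improvement".
short_term_no_improvement = 100 - sum(short_term.values())

# Net shift of cases back into the "no improvement" rating.
relapse_rate = long_term_no_improvement - short_term_no_improvement
print(short_term_no_improvement, relapse_rate)
```

On these assumptions the difference reproduces the forty-one per cent relapse rate cited above.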

Before the operations 13 patients had seizures. In this group there were 3 
schizophrenics, 1 manic depressive, 1 mental deficient, 8 epileptics. Nine of these 
patients continue to have epileptiform convulsions after the operation. Of the 4
patients who no longer manifested this behavior there were two schizophrenics, 1 
manic depressive, and 1 epileptic. This latter patient, however, had not had any 
seizures for four years prior to the leucotomy, and thus the disappearance of this 
symptom could not be attributed to the operation. Following the leucotomies, 6 
patients developed epileptiform seizures who had not previously had this symptom. 
In this group are 5 schizophrenics and 1 involutional psychotic. Larger incidence of 
seizures resulting from psychosurgery has been reported “?. 

Ten of the fourteen patients in the group showing at least some slight improve- 
ment, and 5 out of 9 patients showing moderate to good improvement, continue to 
need symptomatic sedation, restraints, and electroshock, although in lesser amounts 
and with less frequency. 

The largest number of patients selected for leucotomy originally resided on the 
maximum security wards, and it was hoped the operation would result in placement 
of patients in less restricted surroundings with less supervision. At the time the oper- 
ations were done 59 patients were residing on the maximum security wards. Of these, 
28 were immediately returned after the operation and continue to reside there. 
Twelve others at one time or other had to be returned to these wards. Thus, two- 
thirds of the most destructive and violent patients remained on the maximum secur- 
ity wards or subsequently had such outbursts that they needed to be returned, des- 
pite additional electroshock and restraints. Of the 19 patients who remained off the 
maximum security wards, none are well enough at the present time to be placed on 
convalescent wards. Nine are still considered not improved and are on other custod- 
ial wards receiving symptomatic electroshock and restraints. Six others are rated as 
showing at least some slight improvement. Four others have shown moderate to 
good improvement. It should be pointed out, that in the last three years a trans- 
formation has occurred within the hospital setting itself, which, with more staff 
personnel and better trained attendants, makes it less necessary to transfer dis- 
turbed patients to the maximum security wards. 
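
The ward figures in this paragraph can be checked with simple arithmetic. The sketch below uses only the counts stated in the text; the variable names are illustrative assumptions.

```python
on_max_security = 59        # patients on maximum security wards at operation
returned_immediately = 28   # returned after the operation and still there
returned_later = 12         # returned at one time or other

remained_off = on_max_security - returned_immediately - returned_later
share_returned = (returned_immediately + returned_later) / on_max_security

# Ratings of the patients who remained off the maximum security wards.
not_improved, slight, moderate_to_good = 9, 6, 4

print(remained_off, round(share_returned, 3))
```

The share returned comes to roughly two-thirds, and the three rating groups account for all of the patients who remained off the maximum security wards.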

It has been advanced that leucotomy offers a means of improving patients at 
least to the extent that the supply of hospital patient workers is significantly in- 
creased), Ten patients are now doing some ward work, usually only polishing, 
whereas they did no work previously. One patient who had done no work formerly is 
doing good ward work, and three patients who previously did only occasional work 
are doing good ward work now. Six patients who had done no work at all before are 
working at tasks off the wards. Six other patients who had done only ward work 
previously are now working off the wards. 

Of the 102 patients who underwent surgery, twelve left the hospital. Since then 
six (50%) have returned. Of the 6 patients remaining out of the hospital, five are
still on convalescent leave status, and one has been discharged cured. This patient 
had been hospitalized less than one year at the time of the operation with the diag- 
nosis of Involutional Psychosis. However, she developed epileptiform seizures and 
was receiving anti-convulsive medication at the time she left the institution. 


SUMMARY 


1. Of 102 patients given transorbital leucotomies to reduce outspoken and
uncontrolled behavior and to reduce intensive supervision and nursing care, it was
found one and a half to three years later, that 75% showed no improvement, 14%
showed at least some slight improvement, and 9% showed moderate to good
improvement.


2. Reduction of violent behavior, tension, and anxiety was not substantially 
accomplished. Over two-thirds of the most combative and disturbed patients have 
shown no appreciable gains. 


3. A relapse rate of 41% occurred after a more favorable estimate had been
given shortly after the operations.


4. Twelve patients were released from the hospital but 50% of these relapsed
and had to be returned. 


5. The amount of improvement observed in patients did not seem to depend 
on length of time in the hospital. Improvement can best be described as due to 
fortuitous circumstances. 


6. Evaluation of patients is extremely difficult due to the fact that leucotomy 
apparently causes both gains and losses in behavior and required nursing care and 
supervision. 


FACTORS INFLUENCING UTILIZATION OF PSYCHOTHERAPEUTIC
SERVICES IN MALE COLLEGE STUDENTS¹

NORMAN S. GREENFIELD AND WILLIAM F. FEY


University of Wisconsin Medical School 


INTRODUCTION 


Study of the literature relating to psychotherapy discloses a curious lack of 
rigorous attempts to investigate the motivational factors which bring the prospective 
patient to the psychotherapist. The search for such references produces mainly 
opinions of the order of those truths which are held to be self-evident and beyond the 
need of experimental verification. Whitaker and Malone, for example, discuss 
‘negative anxiety’ which has a disruptive quality and suggest that “‘. . . such negative 
anxiety provides much of the motivation of the patient seeking psychotherapy.” 
(4, p. 22) In discussing prognosis, Dollard and Miller observe that the prognosis is
good if the patient is extremely miserable, noting that ‘‘The patient’s motivation for 
therapy comes from his misery . . .”’“> »- #3) Rogers: P- 7© says: ‘“‘The individual is 
under a degree of tension arising from incompatible personal desires or from con- 
flicts of social and environmental demands with individual needs. The tension and 
stress so created are greater than the stress involved in expressing his feelings about 
his problems.”’ These statements reflect the general current of present opinion. We 
find the adjectival reference to the prospective patient largely denoting anxiety, dis- 
comfort, tension and misery. 


PROBLEM 


The routine administration of the Minnesota Multiphasic Personality Inventory 
to all freshmen entering the University of Wisconsin over a period of several years 
made possible a post hoc examination of the MMPIs of a sample of male students who 
had sought psychotherapeutic services at the Student Health Psychiatric Clinic dur- 
ing their four year college course. The present study attempts to evaluate certain 
inferred motivational variables derived from the MMPI and to relate those variables 


¹The authors gratefully acknowledge the cooperation of Dr. Lewis E. Drake, Director of the
Student Counseling Center at the University of Wisconsin, who made the protocols available. 
to the time span intervening between university registration and application for
treatment. 

The Student Psychiatric Section of this University was established in 1937. 
The availability of this service is publicized in a distinct section of the Student Hand- 
book which is distributed to every student at registration time. A recent survey of 
two hundred and seventy-six patients indicated that of this group 36% were self 
referred, 30% were referred by the Student Medical Clinic, 19% by faculty and
deans, 7% from the Student Counseling Center and 8% from miscellaneous sources.
All regularly enrolled students are eligible for psychiatric services and may avail 
themselves of this opportunity any time after registration. The sample of the 
present study includes 132 male college students and the time intervening between
registration and application for therapy ranges from a few days to nearly four years. 
Four hypotheses were formulated and subjected to experimental test: 

1. The greater the degree of anxiety present, the sooner the student will apply 
for psychiatric help. 

2. Students who internalize their problems will come for treatment sooner 
than those who externalize their problems. 

3. Those students who experience the greatest amount of subjective discom- 
fort will seek help soonest. 

4. There will be a positive relationship between the degree of pathology and 
time of clinic appearance, i.e., more seriously disturbed students will apply for treat- 
ment sooner than those students who are less seriously disturbed. 


METHOD 


The foregoing hypotheses were operationally defined in terms of certain dimen- 
sions of the Minnesota Multiphasic Personality Inventory which have been pre- 
viously formulated. An attempt was made to use variables which have had demon- 
strated usefulness in exploring changes which are concomitant with psychotherapy, 
thus insuring a degree of sensitivity sufficient to differentiate between groups. 


1. Anxiety is defined in terms of the Welsh Anxiety Index“. The formula is: 
\[ \mathrm{AI} = \frac{Hs + D + Hy}{3} + (D + Pt) - (Hs + Hy) \]


According to Welsh, the Anxiety Index measures ‘that condition attributed to 
patients complaining of subjective feelings such as tension, nervousness,
apprehension, fear, etc., which is generally accompanied by somatic concomitants—vertigo,
dyspnea, precordial pain, gastric distress, headache and the like’’. © »- 6 


2. Internalization is defined in terms of the Welsh Internalization Ratio. 
The formula is: 
\[ \mathrm{IR} = \frac{Hs + D + Pt}{Hy + Pd + Ma} \]



Welsh states: “Subjects who tend to have many somatic symptoms and subjective
feelings of stress who ‘internalize’ their difficulties can be expected to obtain values 
above 1.00. Those who tend to act out and ‘externalize’ their conflicts will obtain a 
ratio below 1.00”. ». 7) Welsh® presents ample data to support the contention 
that psychotherapy is associated with a significant lowering of both of these indices. 


3. “Subjective discomfort” is defined in terms of the sum of the MMPI scales, 
Hypochondriasis, Depression and Psychasthenia. These scales have been variously 
termed “discomfort”, “mood” and “complaint” scales and are usually contrasted
with the more characterological scales, Hysteria, Psychopathic Deviate and Hypo- 
mania. Gallagher®) has demonstrated that the discomfort scales are sensitive to 
change with psychotherapy. 


4. Degree of pathology is defined in terms of Gallagher’s Maladjustment
Scale ®?. This is an index derived from summing the deviant scores on all the items 
of the Hs, D, Hy, Pd, Pa, Pt and Sc scales. Gallagher has demonstrated a significant
difference between pre- and post-therapy samples on this measure.
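
The first three operational definitions reduce to simple arithmetic on MMPI scale scores. The sketch below assumes the inputs are T scores, which the text does not state explicitly. The function names and the flat profile of 50 on every scale are hypothetical. Gallagher's Maladjustment Scale is omitted because it requires item-level responses.

```python
def welsh_anxiety_index(hs, d, hy, pt):
    """Welsh Anxiety Index: mean of Hs, D, Hy plus (D + Pt) minus (Hs + Hy)."""
    return (hs + d + hy) / 3.0 + (d + pt) - (hs + hy)

def welsh_internalization_ratio(hs, d, pt, hy, pd, ma):
    """Welsh Internalization Ratio: (Hs + D + Pt) / (Hy + Pd + Ma)."""
    return (hs + d + pt) / (hy + pd + ma)

def subjective_discomfort(hs, d, pt):
    """Sum of the Hypochondriasis, Depression and Psychasthenia scales."""
    return hs + d + pt

# Hypothetical flat profile: every scale at a T score of 50.
ai = welsh_anxiety_index(hs=50, d=50, hy=50, pt=50)
ir = welsh_internalization_ratio(hs=50, d=50, pt=50, hy=50, pd=50, ma=50)
discomfort = subjective_discomfort(hs=50, d=50, pt=50)
print(ai, ir, discomfort)
```

A ratio above 1.00 corresponds to what Welsh calls internalizing and a ratio below 1.00 to externalizing; the flat profile sits exactly at the dividing point.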


RESULTS 


The four hypotheses were tested initially by means of F ratios. The cases were 
grouped according to the delay (number of months) between their original testing 
with the MMPI and their appearance in the Clinic. These groupings and the mean 
scores appropriate to each are shown in Table 1. 
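
For readers unfamiliar with the statistic, the F ratio for several independent groups compares the variance between group means with the variance within groups. The sketch below is a generic one-way computation on made-up scores; it does not reproduce the study's data, and the function name `one_way_f` is an assumption.

```python
def one_way_f(groups):
    """One-way analysis of variance F ratio for a list of score lists."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Within-groups sum of squares around each group's own mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical scores for three delay groups.
f_ratio = one_way_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_ratio)
```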


TABLE 1. GROUP MEANS AND F RATIOS FOR THE FOUR MEASURES

                                       Groups
                                A         B         C          F
Factors      Delay (Months)    0-9      10-23     24-45      ratio

             N                  56        40        36

Anxiety                       63.66     52.51     54.97      2.60
Internalization               98.32     96.38     95.91      1.16
Discomfort                   178.82    171.50    167.47      1.07
Pathology                    151.21    143.54    139.16      1.35


None of the F ratios is significant at or beyond the .05 level, leading to the inter- 
pretation that none of these measures is significantly associated with the delay in 
seeking psychiatric help. However, a slight trend is evident from inspection of the 
means, in that the group which appears earliest for therapy tends to show slightly 
elevated scores on every measure, particularly with respect to the Welsh Anxiety 
Scale, where the group A mean is significantly (p .03) above the combined means of 
Groups B and C. The consistency of this finding suggests, moreover, that the meas- 
ures may have a good deal in common. The overlap among the measures is apparent 
from their intercorrelations given in Table 2. 

TABLE 2. INTERCORRELATIONS AMONG THE MEASURES

                           Welsh       Welsh              Gallagher
Measures                   Anxiety     Internalization    Pathology

Welsh Internalization      +.70
Gallagher Pathology        +.72        +.35
Gallagher Discomfort       +.80        +.73               +.96


Each of these correlations is significant beyond the .01 level and, except for that 
between Gallagher’s Pathology and Welsh’s Internalization, substantial. The re- 
markably close association between Gallagher’s indices on this particular population 
doubtless reflects their common score elements and also argues for the use of the 
simpler ‘discomfort’ measure. In general it may be said that these values suggest 
that the four measures are concerned primarily with a single dimension or ‘kind’ of 
disturbance which is not substantially associated with readiness to seek psychiatric 
services, at least in this setting. 
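
One way to make the overlap concrete is to square each coefficient, which gives the proportion of variance two measures share. The sketch below does this for the values printed in Table 2. The dictionary layout and the function names are illustrative assumptions and are not part of the original analysis.

```python
# Intercorrelations as printed in Table 2 (row measure, column measure).
CORRELATIONS = {
    ("Welsh Internalization", "Welsh Anxiety"): 0.70,
    ("Gallagher Pathology", "Welsh Anxiety"): 0.72,
    ("Gallagher Pathology", "Welsh Internalization"): 0.35,
    ("Gallagher Discomfort", "Welsh Anxiety"): 0.80,
    ("Gallagher Discomfort", "Welsh Internalization"): 0.73,
    ("Gallagher Discomfort", "Gallagher Pathology"): 0.96,
}


def shared_variance(r):
    """Proportion of variance shared by two measures correlated at r."""
    return r * r


def shared_variance_table(correlations):
    """Return r squared, rounded to two places, for every pair of measures."""
    return {pair: round(shared_variance(r), 2) for pair, r in correlations.items()}


if __name__ == "__main__":
    for (row, column), r_squared in shared_variance_table(CORRELATIONS).items():
        print(f"{row} x {column}: r^2 = {r_squared:.2f}")
```

On these figures the two Gallagher indices share most of their variance, while Gallagher Pathology and Welsh Internalization share very little, which is the pattern the text describes.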

With regard to the use of Gallagher’s index as a measure of pathology it should 
be noted that it would better be called a measure of subjective frenzy since it is based 
solely on the amplitude of the profile. The generally elevated (the so-called “floating”)
profile is usually associated with a fulminating neurosis and not with the
graver psychotic conditions. None of the indices examined really accommodates the 
critical fact of ‘‘kind’’ of illness which may well bear on the question of how soon a 
particular patient will show for help. 
SUMMARY


The MMPIs of 132 entering male college students, who at some point in their 
four year college course sought psychotherapy at a student health psychiatric clinic, 
were examined with the object of testing the hypothesis that promptness of referral 
is positively related to the degree of anxiety, the extent of internalization, the amount 
of discomfort and the severity of pathology. None of these variables proved to be 
significantly related to the number of months which intervened between the time of 
taking the test and appearance at the clinic. It is concluded that, for the population 
studied, these measures are not indices of therapeutic readiness. 
MMPI SCORE CHANGES INDUCED BY LYSERGIC ACID
DIETHYLAMIDE (LSD-25)1


RICHARD E. BELLEVILLE 


The National Institute of Mental Health 
Addiction Research Center, USPHS Hospital 
Lexington, Kentucky 


INTRODUCTION 


Considerable interest has been shown recently in the action of Lysergic Acid
Diethylamide (LSD-25), a semi-synthetic drug prepared from lysergic acid extracted 
from ergot of rye. This interest is due chiefly to the striking behavioral changes pro- 
duced by extremely minute quantities of this drug (40-60 micrograms) when ad- 
ministered orally, and the reputed resemblance which these behavioral changes bear 
to schizophrenia. The LSD syndrome was first described by Stoll“, whose observa- 
tions have been subsequently confirmed by a number of other investigators on both 
normal and psychotic subjects“: 4). The syndrome is characterized by mood changes, 
feelings of unreality, feelings of depersonalization, perceptual distortions and visual 
hallucinations. 

In general, those studies dealing with the effects of LSD on individuals with 
various personality disorders have utilized interview and observational data in an 
attempt to determine its diagnostic and therapeutic value. Projective techniques 
have been used to some extent in evaluating the psychological effects of this drug; 
however, data obtained with objective personality measures have been particularly 
lacking. In addition, few reports have appeared in which the conditions of drug ad- 
ministration have been rigidly defined or controlled. The present study was under- 
taken in order to investigate some of the psychological effects of LSD and to eval- 
uate the Minnesota Multiphasic Personality Inventory (MMPI) © as an instrument 
for measuring changes produced by pharmaceutical agents. 


1The present study was part of a comprehensive investigation on the effects of LSD-25; physiolo- 
gical and psychiatric aspects will be reported by Harris Isbell and associates. 
SUBJECTS AND PROCEDURE


Twenty-four former narcotic addicts volunteered for the study. All were prisoner
patients serving sentences for violation of the Harrison Narcotic Act.2 The experiment
was carried out on a closed ward where the patients were observed constantly
by specially trained attendants. A dayroom adjoining the ward was used as a testing 
room. The group form of the MMPI was administered under control (no drug), 
placebo, and LSD conditions, and only the standard instructions were given for com- 
pleting the inventory. Each of the subjects received all three of the experimental 
conditions, with both the order and sequence completely counterbalanced. Sub- 
jects were randomly assigned to each of the six possible combinations of order and 
sequence. To allow for complete recovery from drug effects before the next treat- 
ment was given, the interval between any two treatments was never less than three 
days. 
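
The counterbalancing described above can be sketched as follows. Three conditions give six possible orders. The sketch assumes that the twenty-four subjects were spread evenly, four to each order; the text says only that assignment was random and counterbalancing complete. The function name, the seed, and the subject numbering are hypothetical.

```python
import itertools
import random

CONDITIONS = ("control", "placebo", "LSD")


def counterbalanced_assignment(subject_ids, conditions=CONDITIONS, seed=0):
    """Randomly assign subjects to every possible order of the conditions.

    Assumes an equal number of subjects per order (an assumption made for
    this sketch; the source does not state the allocation).
    """
    orders = list(itertools.permutations(conditions))  # 3! = 6 orders
    subjects = list(subject_ids)
    if len(subjects) % len(orders) != 0:
        raise ValueError("subjects cannot be divided evenly among the orders")

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    rng.shuffle(subjects)

    per_order = len(subjects) // len(orders)
    assignment = {}
    for index, order in enumerate(orders):
        block = subjects[index * per_order:(index + 1) * per_order]
        for subject in block:
            assignment[subject] = order
    return assignment


if __name__ == "__main__":
    plan = counterbalanced_assignment(range(1, 25))
    for subject in sorted(plan):
        print(subject, " -> ".join(plan[subject]))
```

With such a plan each condition appears equally often in the first, second, and third position, so practice effects are balanced across treatments.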

When the drug was administered, 50 to 130 micrograms of LSD were given 
orally to fasting subjects at 8:30 a. m. Since LSD is colorless, odorless and tasteless, 
water was used as a placebo, which also was given under fasting conditions and at 
the same time of day as LSD. Since maximal effects of the drug occur within two 
hours after administration and persist for three to six hours and occasionally longer, 
the MMPI was given at 10:00 a. m., one and one-half hours after the drug was in- 
gested. Thus, most subjects completed the inventory in about two hours, during the 
height of drug action. Under control conditions, the subjects were not required to 
fast and no water was given, but the inventory was obtained under conditions that 
were otherwise the same as those for the other two administrations. 

Each of the subjects had been given at least one dose of LSD prior to the experi- 
mental administration of the drug. This was necessary in order to determine the 
effective dose for each individual and also served to reduce the novelty of the drug 
experience. The appropriate dose was determined by various physiological measures 
and by observation of the patient’s behavior. Although an attempt was made to 
produce approximately equal intoxication in all subjects, this was not always possible, 
since there is no a priori indicator of responsivity, and since considerable individual 
differences in response to the drug have been observed. 

The MMPI records obtained under these conditions were scored on the four 
validity scales and nine clinical scales, with the addition of the Taylor Scale of Mani- 
fest Anxiety (A scale)“. The correction factor K was added. An analysis of var- 
iance, using a “treatments x subjects” design “’, was applied to the T-scores on each 
of the scales. Critical differences were calculated to test the significance of differ- 
ences between means. 


RESULTS 

Table 1 presents the mean T-scores obtained under control, placebo, and LSD 
conditions, as well as the over-all F-ratios for each scale. Significant variance, attri- 
butable to the effects of the experimental treatments, was found on the Psychas- 
thenia and Schizophrenia scales at the .01 level of confidence, and on the Paranoia 
and Taylor Anxiety Scales at the .05 level. Mean squares for these scales are pre- 
sented in Table 2. On each of these scales the difference between the LSD and con- 
trol mean and also between the LSD and placebo mean exceeded the critical differ- 
ence, whereas no significant differences were obtained between the control and place- 
bo means. It is noteworthy that increased score elevation is limited to particular 
scales. Because LSD produced specific scale changes rather than over-all elevations, confusion
and misunderstanding of test items may be ruled out as an explanation. Further evidence for this
point is seen by inspection of the F-ratios for the validity scales, all of which were 
well within the limits of chance. 


2The population from which the present sample was drawn has been described in terms of the 
most frequently obtained profile patterns on the MMPI. (Hill, H. E., Belleville, R. E., and Glaser, 
R. An Application of the MMPI to the Narcotic Drug Addict. To be published). 
TABLE 1. MEAN MMPI T-SCORES OBTAINED UNDER CONTROL, PLACEBO AND LSD CONDITIONS, AND
OVER-ALL F-RATIOS FOR EACH SCALE








Conditions 
Scales Control Placebo LSD-25 







? 50.00 50.00 00 
L 52.79 53.63 52.63 
F 55.63 55. 
K 50 | 54. 
Hs 55.58 56.7 
D 31.79 64. 5 
Hy 56.04 55 .§ 
Pd 54 70.5 
MeO 57.08 | 56. 
Pa | 9.75 50. 


Pt 33 | 55. 
Se 20 56.4! 
Ma 5.54 } 64. 
A 7.29 48 
**Significant at .01 level. 

* Significant at .05 level. 











TABLE 2. SUMMARY OF ANALYSIS OF VARIANCE

                                            Mean Squares
Source                    df       Pa         Pt         Sc         A

Treatments                 2     165.39     340.68     628.       160.43
Subjects                  23     204.77     323.42     223.98     424.13
Treatments x Subjects     46      32.88      33.10      64.        22.62
Total                     71      92.30     135.81     132.       156.57
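
As an illustration of how the entries in Table 2 fit together, the sketch below recomputes the degrees of freedom from the design (24 subjects, 3 treatments) and forms each F ratio as the treatments mean square divided by the treatments x subjects mean square, which is the usual error term for this design. That choice of error term is an assumption here, since the text does not spell it out. The Sc column is omitted because its mean squares are truncated in this copy. All names are illustrative.

```python
N_SUBJECTS = 24
N_TREATMENTS = 3

# Mean squares as printed in Table 2; the Sc column is illegible and left out.
MEAN_SQUARES = {
    "Pa": {"treatments": 165.39, "subjects": 204.77, "interaction": 32.88, "total": 92.30},
    "Pt": {"treatments": 340.68, "subjects": 323.42, "interaction": 33.10, "total": 135.81},
    "A": {"treatments": 160.43, "subjects": 424.13, "interaction": 22.62, "total": 156.57},
}


def degrees_of_freedom(n_subjects, n_treatments):
    """Degrees of freedom for a treatments x subjects design."""
    treatments = n_treatments - 1
    subjects = n_subjects - 1
    return {
        "treatments": treatments,
        "subjects": subjects,
        "interaction": treatments * subjects,
        "total": n_subjects * n_treatments - 1,
    }


def f_ratio(mean_squares):
    """Treatments mean square over the treatments x subjects mean square."""
    return mean_squares["treatments"] / mean_squares["interaction"]


def total_mean_square(mean_squares, df):
    """Rebuild the total mean square from the three component sources."""
    sum_of_squares = sum(
        df[source] * mean_squares[source]
        for source in ("treatments", "subjects", "interaction")
    )
    return sum_of_squares / df["total"]


if __name__ == "__main__":
    df = degrees_of_freedom(N_SUBJECTS, N_TREATMENTS)
    print("df:", df)
    for scale, ms in MEAN_SQUARES.items():
        print(scale, "F =", round(f_ratio(ms), 2),
              "rebuilt total MS =", round(total_mean_square(ms, df), 2))
```

The rebuilt total mean squares agree with the printed Total row to within rounding, which supports reading the degrees of freedom as 2, 23, 46, and 71.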





DISCUSSION

Although it is customary for diagnostic purposes to interpret profiles individual- 
ly in terms of patterns of score elevations, the present study was directed toward de- 
termining the effects of LSD which are common to most subjects, making it necessary 
to deal with mean scores. The differences between the means of scale scores under 
control and LSD conditions are interpreted here as reflecting an increase in the number
of symptoms associated with particular diagnostic categories. The greatest increase
following LSD was obtained on the Sc scale, which is not surprising, since this
scale measures the extent to which patients deviate from conventional ways of 
thinking and reacting and since the LSD syndrome has been compared frequently 
with schizophrenia. Indeed, bizarre and unusual thoughts and overt behavior are the 
most striking features of the LSD psychosis. This is exemplified by depersonaliza- 
tion and distortion of the body or appendages which is frequently reported by 
patients under the influence of this drug. 

Another prominent feature of the LSD syndrome is behavior characterized by 
greater suspiciousness, increased personal sensitivity, and intensified self-observa- 
tion. Such behavior is reflected to some extent by the increase on the Pa scale. 
Although compulsive behavior was not observed in these subjects after LSD, the 
increase on the Pt scale can be accounted for by the frequently reported phobic ex- 
periences (fear of imminent death, fear of going insane, fear of bodily changes). The 
Taylor anxiety scale was also increased significantly; however, the amount of this 
increase did not seem to reflect the high degree of anxiety reported by or observed in 
these patients. 

Although the increased T-scores obtained on these scales for LSD over those of 
the other conditions are not great, the differences are appreciable, especially since it 
is extremely difficult to induce short-term profile changes by experimental methods. 
Furthermore, because of the wide variability in response to LSD, many of the more 
striking profile changes are obscured by the scores of relatively unresponsive subjects.
Nevertheless, the results demonstrate that the MMPI is sensitive to some of
the major personality changes induced by LSD. Since the MMPI was designed pri- 
marily as a diagnostic instrument it may prove valuable in clinical settings, e.g. in 
determining the specific effects of drugs on different individuals and in following the 
course and treatment when drugs are used as therapeutic agents. Although it is prob- 
ably insensitive to many actions of drugs, it seems to tap a sufficient number of per- 
sonality variables to provide considerable information on their psychological effects. 
In addition, it has the advantage of standardization on both normal and abnormal 
subjects, making possible comparisons between disorders which occur naturally and 
those which are experimentally produced. 


SUMMARY 

This study was undertaken to investigate some of the psychological effects of 
Lysergic Acid Diethylamide (LSD-25) and to evaluate the MMPI as an instrument 
for measuring personality changes produced by pharmaceutical agents. Twenty-four 
former narcotic addicts were given the MMPI under control, placebo, and LSD conditions.
Analysis of variance showed significant differences in T-scores between control
and LSD conditions and between placebo and LSD conditions on the Pa, Pt, Sc
and A scales. No significant placebo effects were found. The conclusion was drawn 
that the MMPI is sensitive to some of the major psychological changes produced by 
LSD-25, and the suggestion was made that this inventory could find wider use in 
clinical situations in which drugs are employed. 
SOME FACTORS ASSOCIATED WITH THE VISUAL THRESHOLD FOR
TABOO WORDS 


ARON W. SIEGMAN 
Teachers College, Columbia University1


PROBLEM 

The concept of perceptual defense, developed to explain the finding that nega- 
tively valued stimuli such as taboo words require higher perceptual thresholds than 
neutral stimuli®: * 1%, has been criticized on two major accounts. Postman, Bronson 
and Gropper“*) reported that they did not obtain a raised threshold for taboo words 
when the taboo words and the neutral words were equated for familiarity by means 
of the Thorndike-Lorge semantic count. However, adherents of the perceptual de- 
fense hypothesis have retorted: (a) that taboo words appear less frequently in writ- 
ten English than they do in conversation, and hence the word count which is based 
on the frequency with which words appear in literary sources systematically under- 


1A part of this paper was presented at the 1955 meetings of the Midwestern Psychological Association
in Chicago, Illinois.
estimates the familiarity of taboo words“, and (b) in order to test the construct of
perceptual defense it is necessary to determine whether Ss actually considered the 
taboo words as negatively valued stimuli®’. Though it may be difficult to construct 
a satisfactory familiarity criterion, the demand that the stimuli be personally rele- 
vant seems not only reasonable but also feasible. In the present study Ss will rate 
the stimulus words on a pleasantness-anxiety producing scale. The second major 
critique of the perceptual defense hypothesis is that what appears to be a raised thres- 
hold for taboo words may merely reflect S’s reluctance to mention taboo words in 
E’s presence (response suppression) “?. In the present study an attempt was made 
to cope with this problem. Since response suppression implies that neutral words 
are reported immediately upon recognition but that taboo words are perceived re- 
peatedly before they are reported, it follows that response suppression differentially 
increases the frequency of exposure to the taboo words as compared to the neutral 
words. Assuming that recall is a function of frequency of exposure, then Ss with 
higher thresholds for the taboo words than for the neutral words should have higher 
recall scores for the taboo words as compared to the neutral words than Ss with the 
same or lower thresholds for the taboo words than for the neutral words. 

So far the discussion has been based on the assumption that in a group compari- 
son the threshold for taboo words is either the same or higher than the threshold for 
neutral words. A number of investigations, however, suggest that threatening stim- 
uli may elicit either a higher or a lower threshold, depending on stimulus, situational 
and personality variables: > 7 *). The present study will test the hypothesis, sug- 
gested by a previous study “?, that the direction of S’s differential threshold for taboo 
words depends on S’s learned response tendencies to anxiety arousing cues. Ss who 
respond to anxiety arousing cues with increased awareness and vigilance tend to 
have lowered thresholds for taboo words, and Ss who respond to anxiety arousing 
cues with avoidance, denial and suppression tend to have raised thresholds for 
taboo words. 

Finally, the present study will test the hypothesis, which is suggested particular- 
ly by the personality theories of Rogers“*) and Snygg and Combs“'*) but also by 
other personality theories and research": ?» ©, that the more inadequate one’s self- 
percept the more one will be threatened by taboo words, and hence the more in- 
adequate one’s self-percept the greater one’s differential threshold (higher or lower) 
for taboo words. 


METHOD 


Subjects. All Ss who participated in this experiment were students in Intro- 
ductory Psychology and Psychology of Human Adjustment classes at the Uni- 
versity of Wisconsin. After eliminating all Ss who (a) did not know the meaning of 
one or more of the stimulus words, (b) rated one or more of the taboo words as neutral 
or pleasant, and (c) rated one or more of the neutral words as anxiety provoking, 27 


Ss remained. These ratings were made by Ss after the completion of the perceptual 
task. 


TABLE 1. LIST OF STIMULUS WORDS WITH TABOO WORDS IN ITALICS

Word                  Frequency Value

apple (practice)
dance (practice)
bitch
clove
belly
noted
penis
mixer
whore
terse

*Frequency of occurrence in 4½ million words according to
the Thorndike-Lorge semantic count.
**Estimated frequency.
Stimulus words. The taboo and neutral words used in this experiment were
equated for frequency by means of the Thorndike-Lorge semantic count (Table 1).
One of the taboo words, which does not appear in the word count, was matched with
a control word which has a frequency of one (i.e., one in 4½ million words). In order
to cope with the objection to the semantic count as an adequate criterion of fam- 
iliarity of taboo words“, a group of 75 Ss, similar to those which were used in the 
perceptual task, were asked to rank the stimulus words in order of the frequency 
with which these words are generally used in daily conversation. In order to avoid 
defensive operations, Ss were asked not to indicate their names. No significant differ- 
ence between the taboo and neutral words was obtained. 


The perceptual task. Each stimulus word was presented by means of a Ger- 
brands tachistoscope, beginning with an exposure time of .01 second. The exposure 
time of each subsequent presentation was increased by .01 second, until the word 
was correctly reported. All Ss were first presented with two trial words. In order to 
cancel order of presentation effects, one-half of the Ss were given the words in the 
order in which they are listed in Table 1, and the other half were presented the stim- 
ulus words in the reversed order. 
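
The ascending series described above can be sketched as follows. Exposure times are held as whole hundredths of a second to avoid floating-point drift. The callback `reports_correctly`, the ceiling of 100 hundredths, and the rule of alternating presentation order by subject index are assumptions made for the illustration and do not come from the original procedure.

```python
def ascending_threshold(reports_correctly, start=1, step=1, max_exposure=100):
    """Raise exposure time until the word is correctly reported.

    Times are in hundredths of a second. Returns a pair:
    (threshold or None if never reported, number of presentations given).
    """
    exposure = start
    presentations = 0
    while exposure <= max_exposure:
        presentations += 1
        if reports_correctly(exposure):
            return exposure, presentations
        exposure += step
    return None, presentations


def presentation_order(words, subject_index):
    """Give half the subjects the listed order and half the reversed order."""
    if subject_index % 2 == 0:
        return list(words)
    return list(reversed(words))


if __name__ == "__main__":
    # Hypothetical subject who first reports the word at .09 second.
    threshold, shown = ascending_threshold(lambda hundredths: hundredths >= 9)
    print("threshold:", threshold, "hundredths; presentations:", shown)
    print(presentation_order(["clove", "noted", "mixer", "terse"], 1))
```

The presentation count matters for the argument about response suppression: a subject who recognizes a word but withholds the report goes on receiving further exposures of it.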


Recall and rating tasks. After the completion of the perceptual part of the 
experiment, all Ss were asked to list all the words they recalled being flashed on the 
screen of the tachistoscope. Thereafter, they were given a list of the stimulus words, 
and were asked: (a) to check those words whose meaning they did not know, and 
(b) to rate each word on a seven point pleasantness-anxiety arousing scale. Ss were 
also asked whether they withheld any responses during the perceptual part of the 
experiment. 


Response tendency to anxiety and self-esteem questionnaires. Since recent studies 
3. 15) suggest that Taylor Manifest Anxiety Scale (MAS) “” scores reflect S’s res- 
ponse tendency to anxiety, a low score indicating avoidance and suppression and a 
high score indicating increased awareness and sensitivity to anxiety arousing cues, 
all Ss were administered the Taylor MAS. Ss were also administered a self-esteem 
questionnaire. This questionnaire was developed by submitting Rogers’ definition 
of a positive or a negative attitude about one’s self‘ PP. 42), together with a 
number of items which were thought to be indicative of such attitudes, to eight experienced
clinical psychologists. The judges were asked to designate the items indicative
of a positive or negative self-attitude according to the definition. Table 2
lists the items on which there was 80% agreement or better, and which were retained
for the self-esteem or self-concept questionnaire. 


TABLE 2. SELF-ESTEEM SCALE

Item                                                                          Scorable Answer

I find it hard to make talk when I meet new people.                                  F
I frequently need encouragement.                                                     F
Criticism or scolding hurts me terribly.
I am entirely self confident.
I have a need to have others like and admire me.
I like to know important people because it makes me feel important.
It makes me feel uncomfortable to put on a stunt at a party even when others
    are doing the same sort of thing.
I frequently have serious doubts as to whether I have made the right decision
    or done the right thing.
When in a group of people I have trouble thinking of the right things to talk
    about.
I am easily embarrassed.
I certainly feel useless at times.
At times I think I am no good at all.
I am unusually self conscious.
I am certainly lacking in self confidence.
RESULTS

1. The mean threshold (in hundredths of a second) of the 27 Ss for the taboo 
words was 9.21 with a standard deviation of 4.91, while for the neutral words it was 
8.63 with a standard deviation of 4.47. In a one tailed t test for correlated means
(t = 1.82) the difference between the two means is significant beyond the .05 con- 
fidence level. However, since investigators have obtained lowered as well as raised 
thresholds for negatively valued stimuli: *: - a two tailed test seems appropriate, 
and hence P > .05.

2. Ss with higher thresholds for the taboo words than for the neutral words 
(N = 20) obtained a mean recall score for the taboo words of 3.37 with a standard 
deviation of .87, while Ss with lower thresholds for the taboo words than for the 
neutral words (N = 7) obtained a mean recall score of 2.57 with a standard deviation 
of .80. The difference between the two means yields a t of 2.16 which in a one tailed
test is significant beyond the .025 confidence level. The recall scores were deter- 
mined by subtracting the number of neutral words correctly recalled from the num- 
ber of taboo words correctly recalled. These scores were then transformed by means 
of the Freeman-Tukey square-root transformation method “!. 

3. Seventy-eight percent of the Ss (21 out of 27) admitted that they withheld
taboo responses. 

4. Ss with raised thresholds for taboo words (N = 20) obtained a mean Taylor 
MAS score of 15.40 with a standard deviation of 2.42, while Ss with lowered thres- 
holds for the taboo words (N = 7) obtained a mean Taylor MAS score of 21.71 with 
a standard deviation of 8.51. The difference between the two means yields a t of 
2.88 which is significant beyond the .01 confidence level. Assuming that Taylor 
MAS scores reflect S’s response tendency to anxiety, then this finding supports the
hypothesis that the direction of S’s differential threshold depends on S’s learned 
habits in response to anxiety eliciting cues. 

5. The correlation between Ss’ differential threshold scores (which reflect the
deviation, irrespective of direction, of S’s threshold for taboo words from his threshold
for neutral words) and their self-esteem scores was r = -.39. In a one tailed test
this is significant beyond the .025 confidence level. This finding supports the hy- 
pothesis that the extent of S’s differential threshold for taboo words is a function 
of the adequacy of his self-percept. 
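
The group comparisons and the correlation reported above can be checked approximately from the printed summary figures. The sketch below is a minimal illustration and not the author's computation. It assumes a pooled-variance t for independent groups and the standard conversion of r to t on N - 2 degrees of freedom. Because the text does not say whether the standard deviations were computed with N or N - 1 in the denominator, the function accepts either convention through the hypothetical flag `sd_divides_by_n`.

```python
import math


def pooled_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2, sd_divides_by_n=False):
    """Pooled-variance t for two independent groups, from summary figures."""
    if sd_divides_by_n:
        # SDs computed as sqrt(sum of squares / n)
        ss_1 = n_1 * sd_1 ** 2
        ss_2 = n_2 * sd_2 ** 2
    else:
        # SDs computed as sqrt(sum of squares / (n - 1))
        ss_1 = (n_1 - 1) * sd_1 ** 2
        ss_2 = (n_2 - 1) * sd_2 ** 2
    pooled_variance = (ss_1 + ss_2) / (n_1 + n_2 - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n_1 + 1 / n_2))
    return (mean_1 - mean_2) / standard_error


def t_from_r(r, n):
    """t statistic, on n - 2 degrees of freedom, for a correlation r."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


if __name__ == "__main__":
    # Result 2: recall scores, raised-threshold Ss versus lowered-threshold Ss.
    print(round(pooled_t(3.37, 0.87, 20, 2.57, 0.80, 7), 2))
    # Result 4: Taylor MAS, lowered-threshold Ss versus raised-threshold Ss.
    print(round(pooled_t(21.71, 8.51, 7, 15.40, 2.42, 20, sd_divides_by_n=True), 2))
    # Result 5: differential threshold versus self-esteem, N = 27.
    print(round(t_from_r(-0.39, 27), 2))
```

On the rounded figures printed in the text, the recall comparison comes out near 2.13 under the N - 1 convention (reported: 2.16) and the MAS comparison near 2.88 under the N convention (reported: 2.88). The correlation of -.39 corresponds to a t of roughly -2.12 on 25 degrees of freedom. Small discrepancies are to be expected from rounding in the published means and standard deviations.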


DISCUSSION

The results indicate that when the taboo words and neutral words were con- 
trolled for frequency, the group as a whole did not require a significantly higher or 
lower threshold for the taboo words. This finding supports the hypothesis suggested 
by Postman et al“*® and others“ that the raised threshold for taboo words which 
has been reported by a number of investigators is probably a function of the fact 
that they did not control for frequency. 

The fact that the group as a whole did not have a raised threshold for the taboo 
words does not prove, however, as has been assumed by some investigators, that 
taboo words do not elicit a differential threshold. The failure to obtain a group 
differential threshold may be a function of the fact that some Ss respond to taboo 
words with a raised threshold and others with a lowered threshold, and that these 
individual differences are canceled out in a group comparison. That these individual 
differences are not merely a function of the usual between Ss variability, but rather 
also of individual differences in response to the taboo words is supported by the fact 
that a significant negative correlation was obtained between Ss’ differential threshold 
scores and their self-esteem scores. This finding implies that a lowered threshold for 
taboo words does not indicate that the S was not threatened by the taboo words and 
therefore did not engage in defensive operations, but rather that both lowered as 
well as raised thresholds are a response to the threat which is presented by the taboo 
words. 

The fact that Ss with raised thresholds for taboo words obtained significantly 
higher recall scores for the taboo words than Ss with lowered thresholds for taboo 
words, as well as the fact that 79% of the Ss reported that they withheld the report
of taboo words, suggests that response suppression is responsible for a significant
portion of the variance of raised thresholds for taboo words. Response suppression, 
however, is obviously not responsible for lowered thresholds, and probably does not 
account for the total variance of raised thresholds: '*). However, the fact that the 
total variance of differential thresholds for taboo words cannot be explained in terms
of response suppression, does not necessarily point to the presence of unconscious 
defense mechanisms. A number of investigators have pointed out that raised as well 
as lowered thresholds for taboo words can be explained in terms of general perceptual 
principles such as selective sets % 12) M4, pp. 508-04) | 


SUMMARY 


1. The perceptual threshold of 27 undergraduate students for taboo words was 
not significantly different from their threshold for neutral words. It was suggested, 
however, that no group differential threshold was obtained because some Ss respond 
to taboo words with a lowered threshold and others with a raised threshold, and that 
in a group comparison these individual differences are canceled out. 

2. Ss with raised thresholds for taboo words also had higher recall scores for 
the taboo words than Ss with lowered thresholds for taboo words. This as well as the 
fact that 79% of the Ss admitted withholding taboo responses, suggests that response
suppression contributes significantly to the variance of raised thresholds for taboo 
words. It was pointed out, however, that response suppression does not account for 
the total variance of differential thresholds for taboo words.

3. Ss with raised thresholds for taboo words had significantly lower Taylor 
MAS scores than Ss with lowered thresholds for taboo words, which supports the 
hypothesis that the direction of S’s differential threshold depends on S’s learned 
response tendencies to anxiety eliciting cues. 

4. There was a significant negative correlation between Ss’ differential threshold
scores and their self-esteem scores, which supports the hypothesis that the extent
of S’s differential threshold for taboo words is a function of the adequacy of
S’s self-concept. 
ANXIETY AND GOAL-SETTING BEHAVIOR
PHILIP HIMELSTEIN 


Veterans Administration Hospital, Roanoke, Virginia 


PROBLEM 

Several studies have presented results which indicate that the anxious subject 
(S) differs in goal-setting behavior from the normal S and describe a characteristic
pattern for anxious Ss of setting goals below performance so as to avoid failure: *?. 
In a recent study, Ausubel et al? were unable to find any over-all differences in goal 
settings between anxious and nonanxious Ss drawn from a college population. The 
present study is an attempt to compare the goal-setting behavior in both an “ego-involved”
and “non-ego-involved” situation of college students divided into three
groups on the basis of scores on the Taylor Anxiety Scale? and a clinically anxious 
group. 


METHOD 

Subjects. A group of 112 college students was divided into low anxious (7 or
less), middle anxious (8-20), and high anxious (21 or higher) groups on the basis of group
testing scores with the Taylor Anxiety Scale. Thirty clinically anxious Ss were 
selected from an out-patient or open-ward psychiatric population on the basis of a 
rating of manifest anxiety, which was made by the psychiatrist or psychotherapist 
who was directly associated with the patient. These patients were nonpsychotic 
and had symptoms of uneasiness and apprehension. 

Each of the four experimental groups contained an approximately equal number of
males and females. All groups were restricted to the age range 18-35, and a minimum
of a high school education was required. An intelligence level of average or better 
was also required. The sex, age, educational level and Taylor scale characteristics 
of the four experimental groups are summarized in Table 1. 

TABLE 1. SEX, AGE, EDUCATIONAL, AND TAYLOR SCALE CHARACTERISTICS OF THE EXPERIMENTAL
GROUPS

                                                            Mean         Mean
                     Males     Females    Total     Mean    Education    Taylor
Group                N = 74    N = 68     N = 142   age     (yr)         Score

Low anxiety          18        18         36        21.8                  4.4
Middle anxiety       20        20         40        21.3                 12.9
High anxiety         17        19         36        20.9                 24.5
Clinical anxiety     19        11         30        24.5                 29.9








Procedure. A 25-turn finger maze and a series of digit symbols were selected for 
the experimental tasks. All Ss were tested individually. In all cases, the situation 
considered to be relatively ‘‘non-ego-involved”’ was given first. The tasks were so 
assigned that half of the Ss in each group would have the maze as the ego-involved 
task and half would have the digit symbols in this situation. 

In order to arouse a feeling of ego-involvement, an adaptation of the procedure 
employed by Glixman®? was used. S was told, for the non-ego-involved task, that
E was developing a new test, that E was not interested in S’s score as such, but that
he was concerned with obtaining an average from many people. Following this task, 
S was told that he was being given another task so that his performance might be 
compared with the scores of other individuals. It was assumed that if S felt that his 
scores on these tasks would be used in some way to evaluate his ability, he would feel
that his self-esteem was at stake. 

The scores given to the Ss were the number of errors per trial for the maze and 
the number of boxes correctly completed in thirty seconds for the digit symbols. 
Ss were given prearranged scores rather than actual performance scores. Goal settings
were obtained as follows: after a trial, E said, “You made a score of .... on this
trial. What score do you expect to make on the next trial?” This procedure was
followed for eleven trials, so that ten estimations were obtained for each task. The 
goal discrepancy score for a task is the algebraic sum of the discrepancies between 
aspiration and prearranged performance. 
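
The scoring rule just described can be sketched as follows. The sketch assumes that each discrepancy is the stated goal minus the prearranged score announced immediately before it. The text says only that discrepancies are taken between aspiration and prearranged performance, so that pairing and the numbers in the example are hypothetical.

```python
def goal_discrepancy_score(announced_scores, stated_goals):
    """Algebraic sum of (stated goal - score announced just before it).

    announced_scores: the prearranged scores E reported after each trial
    stated_goals:     the score S said he expected on the following trial
    """
    if len(announced_scores) != len(stated_goals):
        raise ValueError("one goal is needed for each announced score")
    return sum(goal - score for goal, score in zip(stated_goals, announced_scores))


if __name__ == "__main__":
    # Ten estimations, as in the procedure; all values are invented.
    announced = [20, 22, 21, 24, 23, 25, 24, 26, 27, 26]
    goals = [22, 23, 23, 24, 25, 25, 26, 27, 27, 28]
    print(goal_discrepancy_score(announced, goals))
```

A positive total indicates goals set above the announced performance and a negative total indicates goals set below it. For the maze, where the score is a count of errors, the sign would have to be interpreted in the opposite sense.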


RESULTS 
The mean goal discrepancies for the two tasks under the two degrees of ego- 
involvement are summarized in Table 2. The scores of the males and females within 
TABLE 2. MEAN GOAL-DISCREPANCIES FOR THE TWO EXPERIMENTAL TASKS UNDER EGO-INVOLVED
AND NON-EGO-INVOLVED CONDITIONS FOR THE FOUR EXPERIMENTAL GROUPS

                    Ego-Involved                  Non-Ego-Involved
                Maze          Digits           Maze          Digits
Group         Mean   SD     Mean   SD        Mean   SD     Mean   SD

9.1 15.3
6.0 13.0
7.4 11.9
3.6 17.3





each group have been combined in the absence of a significant sex difference. An 
examination of the four one-way analyses of variance (see Table 3) reveals that none 
of the F tests is significant at the .05 level of confidence (a value of 3.99 is required 


TasBie 3. ANALYSIS OF VARIANCE Ratios For Two Tasks 
Unpber (A) Eco-INVOLVEMENT AND (B) Non-EGo-INVOLVEMENT 








Task Condition 








Maze A 
Maze B 
Digit Symbols A 2.05 
Digit Symbols B 1.26 





at the .05 level with 3 and 67 df’s). Thus, it can be concluded that there is no sig- 
nificant difference between the mean discrepancy scores for the four groups that is 
not attributable to chance under both ego-involved and non-ego-involved condi- 
tions as defined in this study. It appears that there is no general tendency for anx- 
ious Ss, classified either on the basis of extreme Taylor Scale scores or clinical judg- 
ment, to set goals below the level of performance or below the goals of nonanxious Ss. 
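
As an illustration of the one-way analysis of variance reported above, the following sketch computes an F ratio and its degrees of freedom from raw scores. The function name and the sample data are hypothetical; the original scores are not available, so this shows only the form of the computation, not the authors' figures.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of per-group score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical goal discrepancy scores for three small groups.
f_ratio, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

The resulting F is then compared with the tabled value for the same degrees of freedom at the chosen level of confidence.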


SUMMARY 
1. 36 low anxious, 40 middle anxious, and 36 high anxious college students 
divided on the basis of Taylor Scale scores were compared with 30 clinically anxious 
Ss for goal settings on two level of aspiration tasks. 


2. The differences between the four groups on both tasks under conditions of 
“ego-involvement” and “non-ego-involvement” were found to be non-significant.


THE REPEAT RELIABILITY OF CLINICAL JUDGMENTS
OF TEST RESPONSES* 


WILLIAM A. HUNT AND FRANKLYN N. ARNHOFF 


Northwestern University; University of Nebraska College of Medicine
and the Nebraska Psychiatric Institute


PROBLEM 


In connection with the development of some standardized scales for evaluating
disorganization in schizophrenic thinking as revealed by schizophrenic responses to
items on the Vocabulary and Comprehension subtests of the Wechsler Bellevue scale,
the authors had the opportunity of sampling the inter-judge agreement among a
group of experienced, practicing clinical psychologists. The employment and
geographical stability of this group has made it possible to repeat these observations
over intervals of 3 months and 18 months after the original testing. A further
“cross-validational” group of clinical psychologists from all areas of the country has
been added. In view of the importance of “reliability” in the clinical judgmental process,
the data seem worthy of report. As used loosely in this article the term “reliability”
refers to the agreement of individual clinicians with the mean judgment of the
group, the repetition of this measure after the passage of time, the test-retest
stability of each individual judge’s ratings, and the stability of the findings for
judge-group agreement when obtained from a new group of subjects.


PROCEDURE 

The original subjects were 16 practicing clinical psychologists in the Chicago
area. All had the Ph.D. degree and at least 4 years of professional experience. The
task involved judging the severity of the pathology exhibited in schizophrenic test
responses using a 7 point scale for severity. The stimuli were 50 schizophrenic
responses to items from the Vocabulary subtest and 50 responses from the
Comprehension subtest of the Wechsler Bellevue Intelligence Scale, Form I. The
instructions carefully limited the task to rating “how schizophrenic each of these
responses is,” and specifically excluded such factors as prognosis, chronicity,
therapeutic indication, etc., as bases for judgment. The task was repeated on the same
group at intervals of roughly 3 months and 18 months following the original judgments.
The subjects did not know they would be retested. Since the subjects were limited to
one small geographical area, the experiment was repeated with another group of
clinical psychologists of comparable experience drawn from all over the United
States.

Several measures of stability or reliability of individual performance were used.
Inter-individual agreement was measured by comparing the ratings of each individual
for the various items with the mean value assigned each item by the other
15 judges. Pearson’s r was the correlational formula used. These values were obtained
for all three testings. As a measure of the individual consistency of the
ratings, we compared each judge’s performance on the first testing with his
performance on the second, again using Pearson’s r. To check the representative nature
of our findings, we obtained the same measure of inter-individual agreement on the
second or “cross-validational” group. Finally, to be sure that both experimental
groups were performing a comparable task, a Pearson’s r was obtained between the
group means for each item for the two groups of S’s.
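
The judge-with-group measure described above can be sketched as follows. The function names and the rating data are hypothetical illustrations, not the authors' computations; the sketch simply correlates each judge's ratings with the item means of the remaining judges.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def judge_group_agreement(ratings):
    """ratings[j][i] is judge j's rating of item i.

    Returns one r per judge: that judge against the mean of all other judges.
    """
    n_judges = len(ratings)
    n_items = len(ratings[0])
    agreement = []
    for j, own in enumerate(ratings):
        others_mean = [
            sum(ratings[k][i] for k in range(n_judges) if k != j) / (n_judges - 1)
            for i in range(n_items)
        ]
        agreement.append(pearson_r(own, others_mean))
    return agreement


# Hypothetical 7-point ratings from three judges on four items.
sample = [[1, 2, 3, 4], [2, 3, 4, 5], [1, 3, 5, 7]]
sample_agreement = judge_group_agreement(sample)
```

The same `pearson_r` helper, applied to one judge's first and second testings, gives the test-retest measure.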


RESULTS AND CONCLUSIONS 


The data are reported in Table 1. Individual agreement with the group shows a
range of r’s from .73 to .92 on the first testing, .69 to .95 on the second, and .81 to
.92 on the third for Vocabulary; and for Comprehension, .64 to .88, .66 to .86, and
.71 to .89 respectively. Test-retest coefficients for each individual between first and
second testing range from .65 to .91 for Vocabulary, and .68 to .90 for Comprehension.
These figures indicate that satisfactory reliability may be obtained for clinical
judgments on such a well defined task. That they are not specific to one sampling of
clinicians is shown by the range of .59 to .92 for Vocabulary and .63 to .90 for
Comprehension found with the “cross-validational” group. The correlation between the
two groups for the group means for each item was .93 for Vocabulary and .96 for
Comprehension, supporting our assumption that both groups were operating in
comparable fashion.

TABLE 1. CORRELATION COEFFICIENTS FOR JUDGMENTAL AGREEMENT

Vocabulary: Subject with Group (I, II, III); Test-Retest (I & II)
Comprehension: Subject with Group (I, II, III); Test-Retest (I & II)
New Subject with Group*: Vocab., Comp.

The task set our clinicians, namely, the qualitative evaluation of verbal test
responses, is certainly a typical one. The reliabilities reported here indicate that in
such a familiar clinical situation, acceptably reliable judgments may be obtained.
Their appearance in both groups tested indicates that their significance is not an
artifact of sampling. They confirm the authors’ previous study on scaling, but are
in sharp contrast to two earlier studies of ours in which the reliabilities reported
were much lower. It is our opinion that the higher reliabilities reported here are a
function of the careful instructions issued. As one of us has said elsewhere, “When
dealing with experts in a judgmental situation, the task should be well defined and
the criteria set forth clearly. Otherwise the riches of knowledge may yield confusion
rather than clarity.”

*This study is part of a larger project continuing under ONR contract 7 onr-450(11) with
Northwestern University. The opinions expressed, however, are those of the individual
authors and do not represent the opinions or policy of the Naval service.


DISTRACTION AND AFFECTIVE DISTURBANCE¹
G. A. FOULDS 
Runwell Hospital, Wickford, Essex, England 


INTRODUCTION 


In a previous paper (2) it was reported that certain modifications had been made
in the administration and scoring of the Porteus Mazes with the object of emphasizing
the temperamental factors involved in its performance. It was found that depressives,
whether psychotic or psychoneurotic, tended to work more slowly than
psychopaths, hysterics, anxiety states or obsessionals. It was thought that the
slowness of depressives might be due, at least in part, to their attention being divided
between the test and their own affective disturbance (3). This hypothesis was derived
from observations made during earlier clinical use of the test. Melancholics would
mutter about their ineptitude, unworthiness, shamefulness and so on during the
performance. It seemed probable that those who did not verbalize such ideas were
nevertheless similarly engrossed. In the same year Davis (1) wrote: “test stimuli
which evoke responses from healthy subjects fail to do so or fail even to interrupt
their preoccupation in the painful incidents which they tend to recall from the
recent or remote past.”

In order to test the ‘retardation - divided attention’ hypothesis a technique had 
to be devised to break up the melancholic’s absorption. It was decided to try dis- 
traction. Assuming that depressives would be unable to attend to three activities 
concurrently, one or more must be excluded from awareness. If attention to the 
maze tracing or to the counting be sacrificed, the subject has withdrawn from the 
field. This, in fact, occurred in two cases in the investigation reported below. The 
agitation of two melancholics was so all-pervasive that they could not be induced to 
perform at all. In the remaining thirty cases, attention to the tracing and counting 
was sufficient to keep them in the field. This does not, of course, imply complete 
obliteration of awareness of affective disturbance throughout the test. Inter-individ- 
ual differences in the effect of counting may be related to rapid alterations of at- 
tention. It was thought that, if the ‘blotting out’ process took place with the de- 
pressives, their speed of performance would increase, while the speed of hysterics 
would not increase since their attention was presumed not to be divided. Both these 
predictions were upheld by the experimental findings (3). The evidence from this and
the previous investigation suggests that the effect of counting is greater the more 
intense the preoccupation with affective disturbance until a point is reached when it 
ceases to be effective at all. The present investigation has two principal objects: 
(a) to improve the accuracy of the speed measurement, and (b) to confirm or refute 
the original findings. 


PROCEDURE 


In the previous investigation (3) subjects covered different distances, since varying
amounts of tracing occurred in blind-alleys in the maze. This has been obviated
by marking off each blind-alley at the entrance with a thick red line. The test was
given with and without a distracting stimulus and will subsequently be referred to
as D (Distraction) and C (Control) respectively. The instructions were as follows:


“You have to trace through from here to here without lifting the pencil or 
crossing any lines, red or black. (Whilst you are doing this, you have to repeat 
numbers after me. I will count like this—4, 7, 1,6... and you have to repeat 
the numbers after me at the same time that you are tracing through the mazes). 
Wait for the word ‘Go’ because the time taken is recorded.” 


These instructions were given for the first two mazes (Years 5 and 6). Year 5 was
treated as a practice maze and was not scored.

¹I wish to thank Dr. R. Strom-Olsen, Physician Superintendent of Runwell Hospital, for
permission to publish this communication, and Miss Monica Creasy of Bedford College,
University of London, for her statistical advice.

Two matched groups of subjects were used. Group A consisted of 9 depressive
women (7 of whom were psychotic), 5 depressive men (2 of whom were psychotic)
and one male anxiety state. Group B comprised 9 depressive women (7 psychotic),
5 depressive men (3 psychotic) and one male anxiety state. The mean ages were:
Group A: 49.27 (s.d., 10.62); Group B: 48.20 (s.d., 12.75).

Group A was given the 11 mazes (10 scored) without any counting (C-1); then 
the 11 mazes with counting twice (D-2 & D-3) and, finally, without counting (C-4). 
For Group B the sequence was D-1; C-2; C-3; D-4. All four runs were given in the 
one session. 


RESULTS 


For each subject, the time score at each administration was taken as a percentage
of his total time score for all four runs. Table 1, therefore, shows for each
administration, the mean score for each group, the difference between ‘Control’ and
‘Distraction’ performance together with the respective standard errors, t and P values.
Since C 1 - D 1, C 2 - D 2 and C 3 - D 3 are each significantly different from zero, it
is clear that distraction does speed up the work of depressives. C 4 - D 4, though in
the expected direction, is not significantly different from zero. It may be that by the
fourth run the subjects are approaching physiological limits. Alternatively, the
effect of distraction may extend in time beyond the actual counting. As the difference
between (D 2 - D 3) and (C 2 - C 3) was not significantly different from zero
(t being less than 1), it would appear that the effect of practice on the speed of
depressives is similar whether the test be given consecutively with, or without,
distraction.

TABLE 1. THE MEAN ‘TIME’ SCORES ON THE MAZES FOR 2 DEPRESSIVE
GROUPS AND THE DIFFERENCES BETWEEN ‘CONTROL’ AND ‘DISTRACTION’
PERFORMANCES
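
The percentage scoring and the Control minus Distraction comparison described above can be sketched as follows. All names and times are hypothetical, and the standard error shown (the unpooled form for two independent groups) is an assumption, since the text does not state which formula was used.

```python
from math import sqrt
from statistics import mean, stdev


def percent_of_total(times):
    """Express each run's time as a percentage of the subject's four-run total."""
    total = sum(times)
    return [100.0 * t / total for t in times]


def difference_t(control, distraction):
    """Return (difference of means, standard error, t) for two independent groups.

    Assumption: unpooled standard error; the source does not name its formula.
    """
    diff = mean(control) - mean(distraction)
    se = sqrt(stdev(control) ** 2 / len(control)
              + stdev(distraction) ** 2 / len(distraction))
    return diff, se, diff / se


# Hypothetical subject: seconds taken on runs 1 to 4.
percentages = percent_of_total([30, 25, 25, 20])

# Hypothetical first-run percentage scores for a Control and a Distraction group.
diff, se, t_value = difference_t([30, 32, 34], [24, 26, 28])
```

Scoring each run as a share of the subject's own total removes overall differences in speed between subjects, so only the pattern across runs is compared.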

Abolition of all choice-points in the mazes reduced the intra-individual var- 
iability of the speed measures and thus enabled statistically significant differences to 
be obtained with half the number of subjects used in the original experiments. Aboli- 
tion of choice-points should also increase the range of subjects to whom the pro- 
cedure can be applied. 


CONCLUSIONS 
The prediction that distraction in the form of simultaneous counting would result
in the speeding up of the tracing of depressive patients on a simple maze task has
been confirmed.


REFERENCES 


1. Davis, D. R. Recovery from depression. Brit. J. med. Psychol., 1952, 25, 104-113.

2. Foulds, G. A. Temperamental differences in maze performance: I. Brit. J. Psychol., 1951, 42,
209-217.

3. Foulds, G. A. Temperamental differences in maze performance: II. Brit. J. Psychol., 1952, 43,
33-41.





BODY-CONCEPT DISTURBANCES OF PATIENTS WITH HEMIPLEGIA¹
FRANKLIN C. SHONTZ
Highland View Hospital, Cleveland, Ohio


PROBLEM 

Medical authorities who are concerned with the rehabilitation of hemiplegic
patients have recognized the special importance of the psychological problems these
individuals present. Since many of these problems probably stem from conflicts
aroused by the real need for the hemiplegic person to adapt his conceptual
thinking to a very suddenly and severely altered body-structure, it would be
worthwhile if difficulties of this type could be shown to be demonstrable on objective
instruments of psychological measurement. The purpose of the present research,
therefore, was to devise and evaluate instruments for the measurement of
body-concept disturbances in samples of hemiplegic and non-hemiplegic individuals.

METHOD 

Subjects. Three research samples of 16 persons each were drawn from the
in-patient population of a hospital for people with severe chronic illnesses. The first
sample consisted of hemiplegic patients with cerebral lesions in the dominant
hemisphere (group “HD”; mean CA = 53.2; SD = 11.9). The second sample consisted
of hemiplegic patients with cerebral lesions in the non-dominant hemisphere (group
“HND”; mean CA = 56.6; SD = 12.1). The third sample consisted of patients with
diagnosed physical illnesses other than hemiplegia (group “CI”; mean CA = 52.2;
SD = 11.6). In addition, a fourth sample of 16 persons was drawn from the Volunteer
and Maintenance staffs of the same hospital (group “N”; mean CA = 56.1;
SD = 9.5). None of these subjects had any apparent incapacitating illnesses, and all
were living active, independent lives at the time of the research.


Measurements. Two measuring instruments were used. The first was termed the
Hemiplegia Research Instrument (HRI) and was derived from a subtest of the Eisenson
Examination for Aphasia. The second was a relatively non-projective administration
of the standard Draw-A-Person technique (DAP).

In the administration of the HRI, the subject was asked to point to six different
parts of his own body (eye, foot, ear, shoulder, leg, and elbow) as the names of these
parts were called off, in order, by the examiner. Often, when asked to “point to your
elbow”, certain subjects would attempt a unilateral, rather than a bilateral
designation. That is, they would try to point to the left elbow with the left hand, for
example. This response was noted when it occurred; and both the correctness and the
lateral locations of the patient’s other designations were also recorded on standard
research forms. An additional test of this same type, with specific reference to the
right or left location of four body-parts, served to evaluate each subject’s
comprehension of this lateral distinction. The following “signs” of body-concept
“disturbance” were defined for the HRI:

1. Unilaterality. Designation by the subject of parts on one side of the 
body only. 

2. Inefficiency. Designation of the elbow in a unilateral fashion, as des- 
cribed above, without spontaneous recognition of the simpler bilateral tech- 
nique. 


3. Confusion. Two or more uncorrected misidentifications of body parts, 
or more than one incorrect response to the right-left distinction test. 
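
The three HRI sign definitions above can be written as a small scoring sketch. The function name, argument names, and input coding are hypothetical conveniences and do not reproduce the research forms; the sketch also ignores which body side was designated.

```python
def hri_signs(sides_pointed, elbow_unilateral, misidentifications, right_left_errors):
    """Score the three HRI signs for one subject as booleans.

    sides_pointed: side ("left" or "right") used for each designated part.
    elbow_unilateral: True if the elbow was designated with the same-side hand
        and the bilateral technique was not adopted spontaneously.
    misidentifications: count of uncorrected wrong body-part designations.
    right_left_errors: count of wrong answers on the right-left distinction test.
    """
    return {
        # Sign 1: every designation fell on a single side of the body.
        "unilaterality": len(set(sides_pointed)) == 1,
        # Sign 2: unilateral elbow designation.
        "inefficiency": bool(elbow_unilateral),
        # Sign 3: two or more misidentifications, or more than one right-left error.
        "confusion": misidentifications >= 2 or right_left_errors > 1,
    }


# Hypothetical protocol: all six parts designated on the left side.
example_signs = hri_signs(["left"] * 6, True, 2, 0)
```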


The DAP was administered without subsequent projective questioning, and the
subject was allowed freedom to draw as much or as little of the figure as he chose. The
following “signs” were defined for the DAP:

1. Incompletion. Omission from the drawing of any of the following: head,
trunk, arm(s), or leg(s).

2. Primitivization. Judged on a gestalt basis as either severe (drawing
grossly unrecognizable) or not severe (drawing simplified and immature, but
grossly recognizable as a human figure).


¹Prepared in cooperation with the Department of Physical Medicine and Rehabilitation,
Highland View Hospital, M. Peszczynski, Chief.


Procedure. Both measuring instruments were administered to all subjects, and
each protocol was scored for the presence or absence of the various “signs”. Frequency
of appearance of the signs in the various groups was analyzed by chi square,
corrected for small expectancies, and contingency tables were set up, where possible,
to examine possible inter-relationships between variables. The lower limit of
statistical significance was set at P = .01.
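
A chi square analysis of sign frequencies of the kind described above might look like the following sketch. It assumes that “corrected for small expectancies” refers to Yates’ continuity correction, which the text does not name; the function name and the counts are hypothetical.

```python
def yates_chi_square(a, b, c, d):
    """Chi square for the 2x2 table [[a, b], [c, d]] with Yates' correction.

    Assumption: Yates' continuity correction stands in for the unnamed
    correction for small expectancies.
    """
    n = a + b + c + d
    # Shrink the cross-product difference by n/2, never letting it go negative.
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts: sign present/absent in one group of 16 versus another.
chi_square = yates_chi_square(10, 6, 2, 14)
```

With one degree of freedom the tabled chi square at P = .01 is 6.635, so these invented counts would fall just short of the criterion adopted in the study.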


RESULTS 

Table 1 presents the results which reached or approached significance in the
statistical analysis. Chi square values calculated on the basis of these data indicated
that both body-concept inefficiency and body-concept confusion occurred with
significantly greater frequency in the HD group than in any of the other samples. No
basis was apparent for inferring any differences to exist between the HND, the CI,
and the N groups which were examined.

TABLE 1. FREQUENCY OF OCCURRENCE OF BODY-CONCEPT “DISTURBANCES” IN
THE FOUR SAMPLE SUBGROUPS

Sample Subgroup: HD | HND | CI | N

*Body-concept unilaterality was considered a sign of disturbance only
when it occurred on the non-dominant body-side.

The results obtained from the measurement of body-concept unilaterality actually
failed slightly of statistical significance, but these data are also presented in Table
1 because a phenomenon was inherent in this analysis which deserves special mention.
The unilaterality which characterized group N was not qualitatively equivalent to 
that which characterized group HD. In every case, the unilaterality of the normal 
subjects was present on the dominant (usually the right) side of the body. That is, 
when asked to point to body parts, the normals most frequently pointed to these 
parts on their own dominant body side. In the HD group, however, the unilaterality 
appeared on the non-affected, or previously non-dominant side of the body (usually 
the left), in every case but one. It was apparent, then, that while unilaterality ap- 
peared frequently in both the N and HD groups, its manifestations were quite dis- 
tinct in the two samples. 

The analysis of the frequency of occurrence of “multiple signs” in the individual
subjects revealed that the subjects of group HD showed by far the greatest tendency
to possess more than two such signs per patient. This analysis reflects the degree of
concentration of disturbances in subjects considered separately, and again it was
apparent that individuals in the HD group showed more evidence of such disturbances
than did the individuals of any other group employed.

Contingency tables, which were set up to examine possible inter-relationships
between variables, showed no significant correlations to exist.


DISCUSSION 


The results of the present study support the hypothesis that body-concept
disturbances are frequently-appearing concomitants of hemiplegia, particularly when
the hemiplegia is the result of a lesion in the dominant cerebral hemisphere. The
HRI was found to show very significant differences on two variables and important
qualitative differences on the third, although the DAP was found to be no more
effective in the present investigation than it was when employed by Prater in an
earlier research.

It is not surprising, of course, that paralysis of the dominant, leading body-side
should result in greater psychological trauma than paralysis of the supporting,
non-dominant portion. How great the effects of such a dominant-side paralysis may
be is illustrated by the finding that the “dominant-side unilaterality” which
characterized most of the normal subjects was almost completely reversed in the dominant
hemiplegia sample. This suggests an effort on the part of these hemiplegic subjects
to transpose their entire manipulatory orientation toward their environment.

Detailed clinical information, available for many of the subjects, showed no 
basis for postulating any continuous relationship between the presently investigated 
signs of body-concept disturbance and the degree of expressive aphasia or intellectual 
deterioration. The latter could scarcely have been a significant factor in any case, 
because the HND group also possessed organic brain pathology, but did not evidence 
the extent of body-concept disturbance characteristic of the patients with dominant 
cerebral lesions. 

To explain the present findings, it is possible to postulate a specific kind of
“body-concept aphasia” which may affect the particular aspects of associative or
cognitive functioning related to body senses in general. If such an aphasia exists it is
significant in and of itself and might well be considered a basis for many problems of
rehabilitation and perhaps also for emotional maladjustments in certain groups of the
physically disabled.


SUMMARY 


Four groups of sixteen subjects each were tested with two instruments designed 
to indicate the presence or absence of disturbances in body-conceptions. Groups were 
matched by groups for chronological age, and all subjects were selected from patients 
and staff of a hospital for people with chronic physical illnesses. The groups were 
composed of hemiplegic patients with dominant cerebral lesions, hemiplegic patients 
with non-dominant cerebral lesions, patients with chronic physical illnesses other
than hemiplegia, and normal, healthy individuals. 

Hemiplegics with dominant cerebral lesions were found to exhibit a significantly 
higher number of signs of body-concept disturbance, both on a group and on an 
individual basis. These findings were discussed and tentative explanations of the 
data were presented. 


PERSONALITY CORRELATES OF DUODENAL ULCER AND OTHER
PSYCHOSOMATIC REACTIONS


PETER M. LEWINSOHN
State Hospital, Osawatomie, Kansas¹


PROBLEM 


In the course of a recent experiment designed to test certain propositions
with regard to individual differences in physiological reactivity to stress, the MMPI 
and the Rosenzweig Picture-Frustration Study (PFS) were administered to groups 
of patients with duodenal ulcer, essential hypertension, and neuromuscular tension. 
This paper is concerned with a comparison of the MMPI and Rosenzweig PFS pro- 
files produced by these groups of patients. 


METHOD


The subjects of this study were patients at the VA Hospital, Perry Point,
Maryland.² They were selected by the experimenter on the basis of their case
histories. Four groups of subjects, each consisting of 15 male patients, were used.
Group I (Control) consisted of nonpsychiatric patients, the majority of whom had
received a diagnosis of hemorrhoids and hernia. Group II (Anxiety) consisted of
neurotic patients in whom neuromuscular tension, in the absence of organic pathology,
was a conspicuous symptom. Most of these patients had diagnoses of depressive
reaction and anxiety reaction. Group III (Ulcer) consisted of nonpsychiatric
patients with a diagnosis of duodenal ulcer which in all cases was based on
roentgenologic findings. Group IV (Hypertension) consisted of nonpsychiatric patients
with hypertension in the absence of renal or other organic pathology. The mean blood
pressure of this group was 196 mm Hg systolic, and 117 mm Hg diastolic. The criteria
for the inclusion of a subject in the experimental groups, and other information
about the subjects, are given elsewhere.

The group form of the MMPI was administered to all subjects. Since many 
subjects answered only the first 366 items of the test, the K correction factor was not 
scored. 

The Rosenzweig PFS was administered individually to each subject. The responses
were scored for direction of aggression: extrapunitiveness (E), intropunitiveness (I),
and impunitiveness (M); and for reaction type: obstacle dominant (O-D),
ego-defensive (E-D), and need-persistive (N-P). The records were scored in
accordance with the revised scoring manual.


RESULTS AND DISCUSSION 

The differences between the groups on the MMPI variables are shown in
Table 1. The means of all the MMPI subscales obtained by the Anxiety group are
consistently greater than those obtained by the Control group. While no significant
differences between the Control group and the Ulcer and Hypertension groups on
the Interest (Mf), Hypomania (Ma), and Schizophrenia (Sc) scales could be
demonstrated, the Ulcer and Hypertension groups have significantly greater mean scores
on the Hypochondriasis (Hs), Depression (D), Hysteria (Hy), and Psychopathic
Deviate (Pd) scales than the Control group (p < .01). The mean score of the
Hypertension group on the Psychasthenia scale (Pt) is also significantly greater than
that of the Control group (p < .05). Whereas the Anxiety group has significantly greater
mean scores than the Ulcer and Hypertension groups on the scales for D (p < .05),
Pd (p < .05), Pt (p < .01), Mf (p < .01), Pa (p < .01), Sc (p < .01), and Ma (p < .01),
no significant differences between these groups on the Hs and Hy scales could be
demonstrated.


¹This study was carried out while the author was at the Johns Hopkins University, Baltimore,
Maryland.
²The author wishes to thank Drs. G. Bart Stone and John H. Vitale for their kind cooperation.

TABLE 1. MMPI MEAN T-SCORES OBTAINED BY THE EXPERIMENTAL AND CONTROL GROUPS

It is interesting to note the similarity between the MMPI profiles of the Ulcer
and Hypertensive groups, and between both of these groups and the Anxiety group
on most of the “neurotic” scales. Whereas these results do not necessarily support
any particular theory concerning the etiology of duodenal ulcer and essential
hypertension, they do indicate that patients in these two groups tend to be more
emotionally disturbed, as measured by the MMPI, than “normal” subjects.

In Table 2 the four groups are compared with respect to direction of aggression
and reaction type on the Rosenzweig PFS. No difference attained is statistically
significant. The Rosenzweig PFS was included in the test battery because of the
importance which is ascribed to the conflict over the expression of aggressive
impulses in essential hypertension. On the basis of clinical observations, Saul and
others have emphasized the presence of chronic, intense, and inhibited hostile
impulses in patients with essential hypertension. On the assumption that a relative
predominance of one kind of direction of aggression on the Rosenzweig PFS is
related to the subject’s characteristic mode of expressing aggression in real life
situations, it was hypothesized that the Hypertension group should have a lower
extrapunitiveness score than the Control group. Although the results are in the right
direction, the null hypothesis could not be rejected.

TABLE 2. ROSENZWEIG PFS MEAN SCORES AND STANDARD DEVIATIONS FOR
DIRECTION AND REACTION TYPES OBTAINED BY THE EXPERIMENTAL AND
CONTROL GROUPS

The difficulties associated with the interpretation of negative evidence when
neither the theory under consideration nor the validity of the test is firmly
established have recently been discussed by Cronbach and Meehl. Two possible
interpretations to account for the failure of the Hypertension group to differ
significantly from the Control group on the Rosenzweig PFS are: (1) The Rosenzweig PFS,
in its present form, does not measure the difficulties with the expression of aggressive
impulses that are postulated in association with essential hypertension; and/or (2) The
theory about essential hypertension requires modification. On the basis of the
present study it is impossible to make a decision between these interpretations.


SUMMARY


The present study was concerned with the MMPI and Rosenzweig PFS pro- 
files of patients with duodenal ulcer, essential hypertension, and neuromuscular 
tension. The following results were obtained: 

(1) The Anxiety group had consistently greater mean scores on all of the 
MMPI scales than the Control group. 

(2) The Ulcer and Hypertension groups had significantly greater mean scores 
on the Hypochondriasis, Depression, Hysteria, and Psychopathic Deviate scales of 
the MMPI than the Control group. The mean Psychasthenia scale score of the 
Hypertension group was also significantly greater than that of the Control group. 

(3) The Anxiety group had significantly greater mean scores on the Depression, 
Psychopathic Deviate, Psychasthenia, Interest, Paranoia, Schizophrenia, and Hypo- 
mania scales of the MMPI than the Ulcer and Hypertension groups. 

(4) No significant differences between the groups on the Rosenzweig PFS var- 
iables could be demonstrated. 


REGRESSION OR DISINTEGRATION IN SCHIZOPHRENIA?


EDWARD M. SCOTT 


Eastern Oregon State Hospital 
Pendleton, Oregon 


INTRODUCTION 


Although no general theory of schizophrenia has yet been universally accepted,
since the fundamental nature of schizophrenia has yet to be explained, there are
several “earmarks.” Among the postulated mechanisms are regression and
disintegration. Fenichel believes that Freud succeeded in understanding
schizophrenia “by grouping all the phenomena around the basic concept of regression.”
Beck claims, following Jackson, that a mentally ill person is really “another kind
of person”; not the same person minus something. The schizophrenic is, says Beck,
“in some phase of dissolution.”

Goldfarb suggests a provocative hypothesis; namely, that on the Rorschach, a shift
from M+ to M- in schizophrenia “would imply that the disease process involves
disintegration and qualitative change,” whereas M+ to FM would be “mere regression
to an infantile state.” This investigator conceived of other symbol comparisons that
may test Goldfarb’s hypothesis just as well; for example, M- versus FM-, and M- versus m.
PROCEDURE

Using this basic supposition as a hypothetical construct, the following
investigation was conducted. Thirty-seven schizophrenic patients admitted to the Eastern
Oregon State Hospital were administered the Rorschach and the Bender-Gestalt
Tests. In order to utilize Pascal and Suttell’s scoring system on the latter test,
only those patients between the ages of 15 and 50 who had attended high school
were included in the present study.

Basically, the Rorschachs were scored according to the Beck method. In
addition, Klopfer’s symbols FM and m were employed in order to investigate
Goldfarb’s hypothesis. Finally, the scoring was tabulated on a summary sheet and
chi-squares were calculated on the symbols employed.


RESULTS 


The results of the comparison of entries on the Rorschach are presented in
Table 1. An examination of these data reveals the significant findings. The results
of the Bender-Gestalt were as follows: (1) Scores between 50 and 72 (interpreted as
“suspect” of needing psychiatric help) were obtained by 89% of the population.
(2) On “practical grounds” Pascal and Suttell suggest (since their scoring system
isn’t “perfectly reliable”) that a score of 60 be employed. Using this as a standard,
70% of the population received a score of at least 60.


TABLE 1. COMPARISON OF RORSCHACH SYMBOLS EMPLOYED


Rorschach               Numerical Value          Chi-square
Symbols              1st symbol   2nd symbol


M- vs. FM                63           81            2.250
M- vs. FM-               63           19           25.600*
FM vs. m                 81           26           26.630*
M- vs. m                 63           26           10.362*
M in A vs. FM-           13           19             .473
M vs. FM                  g           81           20.104*

* = 1% level of confidence. 
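
One way a comparison of this kind might be computed is sketched below. The article does not state how its chi-squares were obtained, so the formula used here (a test with one degree of freedom of two observed frequencies against equal expected frequencies) is an assumption, as are the function name and the choice of Python. Under that assumption the sketch reproduces the first entry of Table 1 (M- vs. FM, frequencies 63 and 81); it should not be read as the author's procedure for the remaining entries.

```python
# Hypothetical helper: chi-square for two observed frequencies, assuming
# (this is NOT stated in the article) equal expected frequencies.
def chi_square_equal_expected(first_count, second_count):
    expected = (first_count + second_count) / 2
    return ((first_count - expected) ** 2 / expected
            + (second_count - expected) ** 2 / expected)


# Standard chi-square critical value for 1 degree of freedom at the 1% level.
CRITICAL_VALUE_1_PERCENT = 6.635

# M- (63) versus FM (81), the first row of Table 1.
value = chi_square_equal_expected(63, 81)
print(f"{value:.3f}")                      # 2.250
print(value > CRITICAL_VALUE_1_PERCENT)    # False: not significant at 1%
```

The value of 2.250 falls below the 1% critical value, which agrees with the absence of an asterisk on that row of the table.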
DISCUSSION

Goldfarb’s hypothesis (M- to FM) was not supported by our results. However,
the present investigator feels that a comparison of M- and FM- is more
appropriate, or more “telling.” When this was done a clear difference resulted. The
difficulty is in interpreting FM-. This investigator found nothing in the literature on
this point. In Klopfer’s new text there is no discussion of FM-. Logically and a
priori, FM- should signify something specific, as does M-. By speculation, it
might point to a “full blown” regression.

The fact that no significant difference was found between M in A and FM-
must be kept in mind. There is a possibility that both connote the same
psychodynamics. Beck states that M in A is “a repressed or heavily disguised M,” and
may therefore “stand very near to the dream.” Schachtel states that covert or
overt identification “cuts across the distinction between human and animal
movement.” Whatever interpretation is proposed for FM-, M- is dominant and, according
to the major premise of this study, indicates disintegration.

The predominance of FM over m probably points to an “immediate need
system.” Following Deri’s opinion, it would equate with an open K individual on the
Szondi Test, indicating primary narcissism. The fact that M-, when compared to
m, is significantly more frequent may suggest a lack of tension in the psychotic
state; that is, the patient has “given up,” at least in the terminal stage.

Finally, our Bender-Gestalt records suggest a lack of “ego strength,” as the
term is used by Pascal and Suttell; namely, an attitude toward reality that
connotes “functional impairment.” These same authors, in an excellent study in which
the Bender-Gestalt was used in testing children, normals, neurotics, and
schizophrenics, conclude that “the schizophrenic group in our study, then, was ‘regressive’
only on the items, the reproduction of which seems to depend on learning to a greater
extent than on maturation capacity.”


CONCLUSION AND SUMMARY


The Rorschach and Bender-Gestalt Tests were administered to 37 schizophrenics
routinely admitted to the Eastern Oregon State Hospital. In order to utilize
Pascal and Suttell’s scoring of the latter test, only those patients between 15 and 50
with a high school education were used. The hypothetical premise investigated by
means of the above tests was that of disintegration or regression in schizophrenia.
The results, based on M- as compared to FM- and on M- matched with m, seem to
indicate, since M- occurred significantly more often, that schizophrenia is more of a
disintegrative nature than of a regressive nature, though this latter component does
appear (FM occurred significantly more frequently than m). Further investigation
would help to clarify the results of this investigation, especially the interpretation
of FM-.
EDITORIAL OPINION.





AT THE SIGN OF THE CULT 


In sociological terms, a cult is a group holding an exclusive sacred ideology with
a series of rites centering about their sacred symbols. The members of a cult typically
show an almost religious veneration for their ideas, symbols, rites, authority figures
and particularly for the originators of the system. Usually, the peculiar ideology of
the cult involves some supposedly superior concept of things with many implications
for ideo-motor behavior involving the acting-out of principles. In clinical applied
science, a cult may be defined as a group of practitioners subscribing to the teachings
of some school or system usually founded by some master who is revered with a
devotion which can border on fanaticism, but whose teachings are accepted more or
less on faith and without any solid experimental-statistical validation. Such cults
quickly become organized in more or less formal patterns, attract large numbers of
followers, and may eventually acquire great prestige and professional and economic
strength as their adherents attain positions of power. Unfortunately, however, such
cults impede the true progress of science and deserve to be unmasked as soon as they
are detected.


It is interesting to study the historical development of pseudoscientific cults
and schools, which have perhaps appeared more often in psychological
fields than in medicine. A new cult is established when some new theory or system
receives mass acceptance in the absence of any sound validating evidence. The
originator of the ideas is quickly elevated to the status of a “master” and a “school” of
willing pupils soon begins to disseminate his teachings as widely as possible. In
the absence of any effective scientific validation or legal control, incompetent
practitioners with all degrees of lack of training enter the field and begin practicing or
teaching without regulation. Soon the cult acquires a formal organization with
establishment of a “society,” regular meetings, publication of journals and books,
maneuverings for power, and attempts to convert the unbelieving. Because of the
universal existence of factors of suggestion, natural remission of symptoms, pure
coincidence and other undifferentiated factors, the new cult soon gathers
superficially imposing anecdotal evidence of its successes. Usually it is a tedious and
difficult process to debunk the cult and overcome its prestige, so that many years
may pass before it is discredited, and even then it may still retain many loyal
adherents.


Because we feel that it is very important to identify and discredit psychological 
cults as soon as they appear, we have gathered a list of signs or criteria of cultism 
which may be used as diagnostic evidence that something is rotten in the state of 
Psychologia. Pseudoscientific psychological cults are characterized by: 


1. The advancement of imaginative new theories without any systematic attempt to integrate 
their basic principles with the established body of scientific facts. 


2. The absence of any consistent attempt at scientific validation using experimental-statistical 
methods. Premature claims and publication of results. 


3. The invention of new symbols, terms, vocabularies or rituals which are usually not
operationally defined or related to established meanings. Semantic confusion results from assigning
new meanings to old terms or creating new terms of unclear reference.


4. Communication between cult members is usually carried out in terms of the new cult lang- 
uage which may border perilously near gibberish. 


5. The literature of the cult is usually completely inbred with failure to mention or acknowledge
work done elsewhere.


6. Failure to give proper credit to the original sources of the same or similar ideas. In an
attempt to improve the internal consistency of their own system, old ideas are disguised under
new terminology and the whole paraded as a new discovery.


7. Impermeability of the cult ideology to external criticisms. Inconsistent ideas are repressed 
or suppressed. 


8. Magnification of positive results and disregard of negative results.


9. Undue glorification and adulation of cult originators and authorities.


10. Failure of cult originators and authorities to admit errors willingly or to allow progressive
modification of the system. Absence of self-criticism.


11. Dogmatism and authoritarianism. The words of the masters are incorporated into “bibles”
from which no deviation is permitted. Ritualistic practices.


12. Unwillingness to submit system to outside investigation. 


13. Transmission of the cult system from generation to generation by secret methods which are 
not generally made available to all comers without charge. Secrecy is utilized to prevent out- 
side investigation and debunking. 


14. Undemocratic attempts to secure power and control professional resources. 


15. Development of grandiose ideas such as that the cult is a sort of “elite guard,” the holders
of the only “true” faith, etc.


16. Charging excessive prices for services. Extortionate prices for quack remedies. 


17. Overspecialization by students of the cult who are not exposed to other viewpoints or the 
scientific method in general. 


18. “Splintering” effects as rebellious students renounce some aspect of the teachings of the
master to establish their own schools.


19. The absence of any “team” approach because cult members are unable to cooperate with 
anyone who thinks differently. 


20. The “crusading” approach as cult members proselytize and seek new adherents. 


21. Extreme ego involvement and emotional over-reactivity of cult members in relation to the 
tenets of their system. 


Not all cults show all the signs listed above. Some groups may be cultish only
in part of their activities, showing all degrees from complete quackery to near
scientific respectability. Often cults develop about leaders who were in no respect cultish
themselves and, as is known in religions, no one is more devout than the convert.
We have often speculated that Freud, being as scientifically advanced as he was for
his day and age, would turn in his grave if he knew of the degree of cultism being
perpetrated in his name. It is not our purpose here to identify the various forms of
psychological cultism which may be discerned on our present scene. Let the reader
himself seek to discriminate between fact and fancy in all applications of
psychological practice.
F. C. T. 